llvm-project

Commit Graph

Author	SHA1	Message	Date
Chuang-Yu Cheng	5663848996	[Power9] Implement new vsx instructions: quad-precision move, fp-arithmetic This change implements the following vsx instructions: - quad-precision move xscpsgnqp, xsabsqp, xsnegqp, xsnabsqp - quad-precision fp-arithmetic xsaddqp(o) xsdivqp(o) xsmulqp(o) xssqrtqp(o) xssubqp(o) xsmaddqp(o) xsmsubqp(o) xsnmaddqp(o) xsnmsubqp(o) 22 instructions Thanks Nemanja and Kit for careful review and invaluable discussion! Reviewers: hal, nemanja, kbarton, tjablin, amehsan http://reviews.llvm.org/D16110 llvm-svn: 264565	2016-03-28 07:38:01 +00:00
NAKAMURA Takumi	a51d6ea990	llvm/test/Transforms/FunctionImport/funcimport.ll: -stats REQUIRES +Asserts. llvm-svn: 264561	2016-03-28 02:14:49 +00:00
Duncan P. N. Exon Smith	6565a0d4b2	Reapply ~"Bitcode: Collect all MDString records into a single blob" Spiritually reapply commit r264409 (reverted in r264410), albeit with a bit of a redesign. Firstly, avoid splitting the big blob into multiple chunks of strings. r264409 imposed an arbitrary limit to avoid a massive allocation on the shared 'Record' SmallVector. The bug with that commit only reproduced when there were more than "chunk-size" strings. A test for this would have been useless long-term, since we're liable to adjust the chunk-size in the future. Thus, eliminate the motivation for chunk-ing by storing the string sizes in the blob. Here's the layout: vbr6: # of strings vbr6: offset-to-blob blob: [vbr6]: string lengths [char]: concatenated strings Secondly, make the output of llvm-bcanalyzer readable. I noticed when debugging r264409 that llvm-bcanalyzer was outputting a massive blob all in one line. Past a small number, the strings were impossible to split in my head, and the lines were way too long. This version adds support in llvm-bcanalyzer for pretty-printing. <STRINGS abbrevid=4 op0=3 op1=9/> num-strings = 3 { 'abc' 'def' 'ghi' } From the original commit: Inspired by Mehdi's similar patch, http://reviews.llvm.org/D18342, this should (a) slightly reduce bitcode size, since there is less record overhead, and (b) greatly improve reading speed, since blobs are super cheap to deserialize. llvm-svn: 264551	2016-03-27 23:17:54 +00:00
Duncan P. N. Exon Smith	456c9968e5	Support: Implement StreamingMemoryObject::getPointer The implementation is fairly obvious. This is preparation for using some blobs in bitcode. For clarity (and perhaps future-proofing?), I moved the call to JumpToBit in BitstreamCursor::readRecord ahead of calling MemoryObject::getPointer, since JumpToBit can theoretically (a) read bytes, which (b) invalidates the blob pointer. This isn't strictly necessary for the two memory objects we have: - The return of RawMemoryObject::getPointer is valid until the memory object is destroyed. - StreamingMemoryObject::getPointer is valid until the next chunk is read from the stream. Since the JumpToBit call is only going ahead to a word boundary, we'll never load another chunk. However, reordering makes it clear by inspection that the blob returned by BitstreamCursor::readRecord will be valid. I added some tests for StreamingMemoryObject::getPointer and BitstreamCursor::readRecord. llvm-svn: 264549	2016-03-27 23:00:59 +00:00
Teresa Johnson	569af59b14	Use DAG check to try to appease bot Try to appease http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/34772. This was the only check that didn't use DAG and it wasn't found. llvm-svn: 264538	2016-03-27 15:36:43 +00:00
Teresa Johnson	d29478f70e	[ThinLTO] Add optional import message and statistics Summary: Add a statistic to count the number of imported functions. Also, add a new -print-imports option to emit a trace of imported functions, that works even for an NDEBUG build. Note that emitOptimizationRemark does not work for the above printing as it expects a Function object and DebugLoc, neither of which we have with summary-based importing. This is part 2 of D18487, the first part was committed separately as r264536. Reviewers: joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D18487 llvm-svn: 264537	2016-03-27 15:27:30 +00:00
Hal Finkel	0b37175ca6	[PowerPC] Map max/minnum intrinsics and fmax/fmin to ISD nodes for CTR-based loop legality Intrinsic::maxnum and Intrinsic::minnum, along with the associated libc function calls (fmax[f], etc.) generally map to function calls after lowering. For some vector types with QPX at least, however, we can legally lower these, and we don't need to prohibit CTR-based loops on their account. It turned out, however, that the logic that checked the opcodes associated with intrinsics was broken (it would set the Opcode variable, but that variable was later checked only if set for some otherwise-external function call). This fixes the latter problem and adds the FMAX/MINNUM mappings. llvm-svn: 264532	2016-03-27 05:40:56 +00:00
Michael Kruse	ff379b69b2	[Verifier] Reject PHIs using defs from own block. Reject the following IR as malformed (assuming that %entry, %next are not in a loop): next: %y = phi i32 [ 0, %entry ] %x = phi i32 [ %y, %entry ] Such PHI nodes came up in PR26718. While there was no consensus on whether or not this is valid IR, most opinions on that bug and in a discussion on the llvm-dev mailing list tended towards a "strict interpretation" (term by Joseph Tremoulet) of PHI node uses. Also, the language reference explicitly states that "the use of each incoming value is deemed to occur on the edge from the corresponding predecessor block to the current block" and `DominatorTree::dominates(Instruction*, Use&)` uses this definition as well. For the code mentioned in PR15384, clang does not compile to such PHIs (anymore?). The test case still hangs when replacing `%tmp6` with `%tmp` in revisions before r176366 (where PR15384 has been fixed). The occurrence of %tmp6 therefore was probably unintentional. Its value is not used except in other PHIs. Reviewers: majnemer, reames, JosephTremoulet, bkramer, grosser, jdoerfert, kparzysz, sanjoy Differential Revision: http://reviews.llvm.org/D18443 llvm-svn: 264528	2016-03-26 23:32:57 +00:00
Sanjay Patel	796db35f62	[SimplifyCFG] propagate branch metadata when creating select (PR26636) llvm-svn: 264527	2016-03-26 23:30:50 +00:00
Sanjay Patel	342f7c7e10	minimize test cases These are tests for store transforms. The loads, adds, and geps were irrelevant. llvm-svn: 264526	2016-03-26 23:09:25 +00:00
David Blaikie	4dd03f0e12	llvm-dwp: Include the dwo name (if available) when diagnosing duplicate CU IDs from dwp input files If you're building dwps from other dwps, it can be hard to track down a duplicate CU ID if it comes from two compilations of the same file in different modes, etc. By including the .dwo path (which is hopefully more unique than the file path) it can help track down where the duplicates came from. llvm-svn: 264520	2016-03-26 20:32:14 +00:00
Simon Pilgrim	dcdf85033c	[X86][AVX] Enabled SMUL_LOHI/UMUL_LOHI v8i32 vectors on AVX1 targets Correct splitting of v8i32 vectors into v4i32 vectors to prevent scalarization llvm-svn: 264517	2016-03-26 18:32:13 +00:00
Simon Pilgrim	e4dbeb40c6	[X86][AVX] Enabled MULHS/MULHU v16i16 vectors on AVX1 targets Correct splitting of v16i16 vectors into v8i16 vectors to prevent scalarization Differential Revision: http://reviews.llvm.org/D18307 llvm-svn: 264512	2016-03-26 15:44:55 +00:00
Simon Pilgrim	3eef33a806	[X86][SSE] Add MULHS/MULHU custom lowering for i8 vectors Currently this is mainly to prevent scalarization of integer division by constants. Differential Revision: http://reviews.llvm.org/D18307 llvm-svn: 264511	2016-03-26 15:27:20 +00:00
Simon Pilgrim	7b36cdaecf	[X86][SSE] Added v64i8 vector integer multiply tests llvm-svn: 264510	2016-03-26 09:50:06 +00:00
Simon Pilgrim	7379a70677	[X86][AVX512BW] AVX512BW can sign-extend v32i8 to v32i16 for simpler v32i8 multiplies. Only pre-AVX512BW targets need to split v32i8 vectors. llvm-svn: 264509	2016-03-26 09:44:27 +00:00
David Majnemer	b549ab02b4	[PowerPC] Disable the CTR optimization in the presence of {min,max}num The minnum and maxnum intrinsics get lowered to libcalls which invalidates the CTR optimization. This fixes PR27083. llvm-svn: 264508	2016-03-26 09:42:31 +00:00
Simon Pilgrim	9a5f19f509	[X86][SSE] Refreshed vector integer multiply tests Add all 256-bit vector tests. Added AVX512F/AVX512BW test targets. Renamed tests something more meaningful. llvm-svn: 264507	2016-03-26 09:35:48 +00:00
Chuang-Yu Cheng	065969ec8e	[Power9] Implement new altivec instructions: permute, count zero, extend sign, negate, parity, shift/rotate, mul10 This change implements the following vector operations: - vclzlsbb vctzlsbb vctzb vctzd vctzh vctzw - vextsb2w vextsh2w vextsb2d vextsh2d vextsw2d - vnegd vnegw - vprtybd vprtybq vprtybw - vbpermd vpermr - vrlwnm vrlwmi vrldnm vrldmi vslv vsrv - vmul10cuq vmul10uq vmul10ecuq vmul10euq 28 instructions Thanks Nemanja, Kit for invaluable hints and discussion! Reviewers: hal, nemanja, kbarton, tjablin, amehsan Phabricator: http://reviews.llvm.org/D15887 llvm-svn: 264504	2016-03-26 05:46:11 +00:00
Mehdi Amini	01e321306b	ThinLTO: use the callgraph from the combined index to drive the FunctionImporter Summary: Now that the summary contains the full reference/call graph, we can replace the existing function importer that loads and inspects the IR to iteratively walk the call graph by a traversal based purely on the summary information. Decouple the actual importing decision from any IR manipulation. Reviewers: tejohnson Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D18343 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 264503	2016-03-26 05:40:34 +00:00
Richard Smith	1fd6d1fd36	Stop testing the unspecified order in which the OnDiskHashTable stores entries. llvm-svn: 264487	2016-03-26 02:02:59 +00:00
Philip Reames	b5681138e4	Allow value forwarding past release fences in GVN A release fence acts as a publication barrier for stores within the current thread to become visible to other threads which might observe the release fence. It does not require the current thread to observe stores performed on other threads. As a result, we can allow store-load and load-load forwarding across a release fence. We choose to be much more conservative about stores. In theory, nothing prevents us from shifting a store from after a release fence to before it, and then eliminating the preceding (previously fenced) store. Doing this without actually moving the second store is likely also legal, but we chose to be conservative at this time. The LangRef indicates only atomic loads and stores are affected by fences. This patch chooses to be far more conservative than that. This is the GVN companion to http://reviews.llvm.org/D11434 which applied the same logic in EarlyCSE and has been baking in tree for a while now. Differential Revision: http://reviews.llvm.org/D11436 llvm-svn: 264472	2016-03-25 22:40:35 +00:00
David Majnemer	020e890a19	[X86] Emit a proper ADJCALLSTACKDOWN in EmitLoweredTLSAddr We forgot to add the second machine operand to our ADJCALLSTACKDOWN, resulting in crashes in PEI. This fixes PR27071. llvm-svn: 264465	2016-03-25 21:49:11 +00:00
Jun Bum Lim	36c53fe147	[MachineCopyPropagation] Expose more dead copies across instructions with regmasks When encountering instructions with regmasks, instead of cleaning up all the elements in MaybeDeadCopies map, remove only the instructions erased. By keeping more instruction in MaybeDeadCopies, this change will expose more dead copies across instructions with regmasks. llvm-svn: 264462	2016-03-25 21:15:35 +00:00
Nirav Dave	fa250cad37	Prevent construction of cycle in DAG store merge When merging stores in DAGCombiner, add check to ensure that no dependencies exist that would cause the construction of a cycle in our DAG. This may happen if one store has a data dependence on another instruction (e.g. a load) which itself has a (chain) dependence on another store being merged. These stores cannot be merged safely and doing so results in a cycle that is discovered in LegalizeDAG. This test is only done in cases where alias analysis is used (UseAA) as non-AA store merge candidates will be merged logically after all loads which have been checked to not alias. Reviewers: ahatanak, spatel, niravd, arsenm, hfinkel, tstellarAMD, jyknight Subscribers: llvm-commits, tberghammer, danalbert, srhines Differential Revision: http://reviews.llvm.org/D18336 llvm-svn: 264461	2016-03-25 21:06:30 +00:00
Sanjay Patel	69632447b1	[InstSimplify] regenerate checks using a script I didn't notice any significant changes in the actual checks here; all of these tests already used FileCheck, so a script can batch update them in one shot. This commit is just to show the value of automating this process: We have uniform formatting as opposed to a mish-mash of check structure that changes based on individual prefs and the current fashion. This makes it simpler to update when we find a bug or make an enhancement. llvm-svn: 264457	2016-03-25 20:12:25 +00:00
Sanjoy Das	d4c783335b	[RS4GC] Lower calls to @llvm.experimental.deoptimize This changes RS4GC to lower calls to ``@llvm.experimental.deoptimize`` to gc.statepoints wrapping ``__llvm_deoptimize``, and changes ``callsGCLeafFunction`` to recognize ``@llvm.experimental.deoptimize`` as a non GC leaf function. I've had to hard code the ``"__llvm_deoptimize"`` name in RewriteStatepointsForGC; since ``TargetLibraryInfo`` is available only during codegen. This isn't without precedent in the codebase, so I'm not overtly concerned. llvm-svn: 264456	2016-03-25 20:12:13 +00:00
Saleem Abdulrasool	750a90df6a	ARM: maintain BB ordering when expanding WIN__DBZCHK It is possible to have a fallthrough MBB prior to MBB placement. The original addition of the BB would result in reordering the BB as not preceding the successor. Because of the fallthrough nature of the BB, we could end up executing incorrect code or even a constant pool island! Insert the spliced BB into the same location to avoid that. Thanks to Tim Northover for invaluable hints and Fiora for the discussion on what may have been occurring! llvm-svn: 264454	2016-03-25 19:48:06 +00:00
Hans Wennborg	5f916d3df4	[X86] Use "and $0" and "orl $-1" to store 0 and -1 when optimizing for minsize 64-bit, 32-bit and 16-bit move-immediate instructions are 7, 6, and 5 bytes, respectively, whereas and/or with 8-bit immediate is only three bytes. Since these instructions imply an additional memory read (which the CPU could elide, but we don't think it does), restrict these patterns to minsize functions. Differential Revision: http://reviews.llvm.org/D18374 llvm-svn: 264440	2016-03-25 18:11:31 +00:00
Sanjay Patel	cd7d3ae7cc	[InstCombine] use FileCheck for better checking (testing script for autogeneration of check lines) llvm-svn: 264438	2016-03-25 18:03:40 +00:00
Sanjay Patel	d3d1179463	[InstCombine] use FileCheck for better checking (testing script for autogeneration of check lines) llvm-svn: 264437	2016-03-25 18:03:17 +00:00
Sanjay Patel	721fec09b5	[InstCombine] use FileCheck for better checking (testing script for autogeneration of check lines) llvm-svn: 264435	2016-03-25 18:03:01 +00:00
Sanjay Patel	1395cf0d3c	[InstCombine] use FileCheck for better checking (testing script for autogeneration of check lines) llvm-svn: 264434	2016-03-25 18:02:14 +00:00
Sanjay Patel	bfbac177d2	[InstCombine] use FileCheck for better checking (testing script for autogeneration of check lines) llvm-svn: 264433	2016-03-25 18:01:55 +00:00
Sanjay Patel	08da4b7cd8	[InstCombine] use FileCheck for better checking (testing script for autogeneration of check lines) llvm-svn: 264432	2016-03-25 18:01:37 +00:00
Sanjay Patel	8f22390137	[InstCombine] use FileCheck for better checking (testing script for autogeneration of check lines) llvm-svn: 264431	2016-03-25 18:01:23 +00:00
Sanjay Patel	5270746978	[InstCombine] use FileCheck for better checking (testing script for autogeneration of check lines) llvm-svn: 264430	2016-03-25 18:01:04 +00:00
Reid Kleckner	f6f04f8fc8	Consider regmasks when computing register-based DBG_VALUE live ranges Now register parameters that aren't saved to the stack or CSRs are considered dead after the first call. Previously the debugger would show whatever was in the register. Fixes PR26589 Reviewers: aprantl Differential Revision: http://reviews.llvm.org/D17211 llvm-svn: 264429	2016-03-25 17:54:46 +00:00
Sanjay Patel	246e7f7057	[InstCombine] consolidate regression tests of the ancients (2002) Testing out the check-generator-script that's now in the utils folder. llvm-svn: 264424	2016-03-25 17:16:32 +00:00
Adrian Prantl	5979790e42	Document the purpose of this testcase. llvm-svn: 264421	2016-03-25 16:49:57 +00:00
Hemant Kulkarni	966b3ac502	[llvm-readobj] Impl GNU style program headers print readelf -lW Differential Revision: http://reviews.llvm.org/D18372 llvm-svn: 264415	2016-03-25 16:04:48 +00:00
Duncan P. N. Exon Smith	fc8110041f	Revert "Bitcode: Collect all MDString records into a single blob" This reverts commit r264409 since it failed to bootstrap: http://lab.llvm.org:8080/green/job/clang-stage2-configure-Rlto_build/8302/ llvm-svn: 264410	2016-03-25 15:22:27 +00:00
Duncan P. N. Exon Smith	fdbf0a5af8	Bitcode: Collect all MDString records into a single blob Optimize output of MDStrings in bitcode. This emits them in big blocks (currently 1024) in a pair of records: - BULK_STRING_SIZES: the sizes of the strings in the block, and - BULK_STRING_DATA: a single blob, which is the concatenation of all the strings. Inspired by Mehdi's similar patch, http://reviews.llvm.org/D18342, this should (a) slightly reduce bitcode size, since there is less record overhead, and (b) greatly improve reading speed, since blobs are super cheap to deserialize. I needed to add support for blobs to streaming input to get the test suite passing. - StreamingMemoryObject::getPointer reads ahead and returns the address of the blob. - To avoid a possible reallocation of StreamingMemoryObject::Bytes, BitstreamCursor::readRecord needs to move the call to JumpToEnd forward so that getPointer is the last bitstream operation. llvm-svn: 264409	2016-03-25 14:40:18 +00:00
David L Kreitzer	8d441eb936	Enable non-power-of-2 #pragma unroll counts. Patch by Evgeny Stupachenko. Differential Revision: http://reviews.llvm.org/D18202 llvm-svn: 264407	2016-03-25 14:24:52 +00:00
Matt Arsenault	8c8fcb2585	AMDGPU: Cost model for basic integer operations This resolves bug 21148 by preventing promotion to i64 induction variables. llvm-svn: 264376	2016-03-25 01:16:40 +00:00
Hans Wennborg	4ae5119eeb	X86: Use push-pop for materializing 8-bit immediates for minsize (take 2) This is the same as r255936, with added logic for avoiding clobbering of the red zone (PR26023). Differential Revision: http://reviews.llvm.org/D18246 llvm-svn: 264375	2016-03-25 01:10:56 +00:00
Matt Arsenault	9651813ee0	AMDGPU: Partially implement getArithmeticInstrCost for FP ops llvm-svn: 264374	2016-03-25 01:00:32 +00:00
Duncan P. N. Exon Smith	efe16c8eb4	IR: Stop upgrading !llvm.loop attachments via MDString Remove logic to upgrade !llvm.loop by changing the MDString tag directly. This old logic would check (and change) arbitrary strings that had nothing to do with loop metadata. Instead, check !llvm.loop attachments directly, and change which strings get attached. Rather than updating the assembly-based upgrade, drop it entirely. It has been quite a while since we supported upgrading textual IR. llvm-svn: 264373	2016-03-25 00:56:13 +00:00
Saleem Abdulrasool	0dab98d926	ARM: fix optimised division on WoA We did not have an explicit branch to the continuation BB. When the check was hoisted, this could permit control flow to fall through into the division trap. Add the explicit branch to the continuation basic block to ensure that code execution is correct. llvm-svn: 264370	2016-03-25 00:34:11 +00:00
Matt Arsenault	51d702812d	TTI: Report 0 cost for free addrspacecasts llvm-svn: 264369	2016-03-25 00:26:29 +00:00
Matt Arsenault	8e9aa0acc8	TTI: Use 0 for cost of fabs if free Ideally this would also happen for fneg, but that isn't a distinct operation in the IR. llvm-svn: 264368	2016-03-25 00:26:22 +00:00
Matt Arsenault	59767cea79	AMDGPU: TTI: Make insertelement free. We don't want to have a cost to scalarizing operations. llvm-svn: 264364	2016-03-25 00:14:11 +00:00
Manman Ren	9dd8c14674	CXX TLS: collect return blocks after SelectAllBasicBlocks. It is incorrect to get the corresponding MBB for a ReturnInst before SelectAllBasicBlocks since SelectAllBasicBlocks can change the correspondence between a ReturnInst and the MBB it is in. PR27062 llvm-svn: 264358	2016-03-24 23:21:29 +00:00
Sanjoy Das	731c67fed2	Lower varargs correctly in deopt bundle lowering Earlier we were ignoring varargs in LowerCallSiteWithDeoptBundle because populateCallLoweringInfo does not set CallLoweringInfo::IsVarArg. llvm-svn: 264354	2016-03-24 22:37:52 +00:00
David Blaikie	ce7c6cfe0e	llvm-dwp: Coalesce code for reading the CU's DW_AT_GNU_dwo_id and DW_AT_name Going to be reading the DW_AT_GNU_dwo_name shortly as well, and there was already enough duplication here that it was worth refactoring rather than adding even more. llvm-svn: 264350	2016-03-24 22:17:08 +00:00
Mike Aizatsky	6b510a4c01	[sancov] renaming statistics fields. llvm-svn: 264349	2016-03-24 21:49:55 +00:00
Matthias Braun	ae81c29352	LiveInterval: Fix Distribute() failing on liveranges with unused VNInfos This fixes http://llvm.org/PR26991 llvm-svn: 264345	2016-03-24 21:41:38 +00:00
David Majnemer	e09d035dad	[LoopStrengthReduce] Don't hoist into a catchswitch We try to hoist the insertion point as high as possible to encourage sharing. However, we must be careful not to hoist into a catchswitch as it is both an EHPad and a terminator. llvm-svn: 264344	2016-03-24 21:40:22 +00:00
Eric Christopher	b979d51afa	Finish the incomplete 'd' inline asm constraint support for PPC by making sure we give it a register and mark it as a register constraint. llvm-svn: 264340	2016-03-24 21:04:52 +00:00
Eric Christopher	8c95d53d45	Reorder check lines, comments in test and remove unnecessary IR. llvm-svn: 264339	2016-03-24 21:04:47 +00:00
Sanjoy Das	6bcfe31820	Match call and target calling conventions in test Fixes an issue in rL264329. llvm-svn: 264337	2016-03-24 20:51:24 +00:00
Reid Kleckner	01bc66a8ce	Revert "Recommitted r263424 "Supporting all entities declared in lexical scope in LLVM debug info." After fixing PR26942 (the fix is included in this commit)." This reverts commit r264280. This broke building Chromium for iOS. We'll upload a reproducer to the PR soon. llvm-svn: 264334	2016-03-24 20:38:49 +00:00
Sanjoy Das	df9ae70f49	Add lowering support for llvm.experimental.deoptimize Summary: Only adds support for "naked" calls to llvm.experimental.deoptimize. Support for round-tripping through RewriteStatepointsForGC will come as a separate patch (should be simpler than this one). Reviewers: reames Subscribers: sanjoy, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18429 llvm-svn: 264329	2016-03-24 20:23:29 +00:00
Krzysztof Parzyszek	c9d4caa32c	[Hexagon] Add support for run-time stack overflow checking Patch by Sundeep Kushwaha. llvm-svn: 264328	2016-03-24 20:20:07 +00:00
Krzysztof Parzyszek	181fdbd174	[Hexagon] Generate PIC-specific versions of save/restore routines In PIC mode, the registers R14, R15 and R28 are reserved for use by the PLT handling code. This causes all functions to clobber these registers. While this is not new for regular function calls, it does also apply to save/restore functions, which do not follow the standard ABI conventions with respect to the volatile/non-volatile registers. Patch by Jyotsna Verma. llvm-svn: 264324	2016-03-24 19:18:48 +00:00
Sanjoy Das	c0c59fe14e	[Statepoints] Fix yet another issue around gc pointer uniqueing Given that StatepointLowering now uniques derived pointers before putting them in the per-statepoint spill map, we may end up with missing entries for derived pointers when we visit a gc.relocate on a pointer that was de-duplicated away. Fix this by keeping two maps, one mapping gc pointers to their de-duplicated values, and one mapping a de-duplicated value to the slot it is spilled in. llvm-svn: 264320	2016-03-24 18:57:39 +00:00
David Blaikie	0b214e4a2a	[debuginfo] Include dwo_name in the split unit to improve dwp diagnostics When multiple DWP files are merged together and duplicate DWO IDs are found it's currently difficult to give an actionable error message - the DW_AT_name of the CU could be provided, but might be identical (if the same source file is built into two different configurations), which doesn't help the user identify the problem. When no intermediate DWP files are generated, the path to the two DWO files could be provided - but is lost once the DWOs are merged into a DWP. So, include the name of the DWO (dwo_name) in the split file so that collisions involving a source CU from a DWP can be better diagnosed. (improvements to llvm-dwp using this to come shortly) llvm-svn: 264316	2016-03-24 18:37:08 +00:00
Adam Nemet	7aba60c853	[LLE] Check for mismatching types between the store and the load earlier isDependenceDistanceOfOne asserts that the store and the load access through the same type. This function is also used by removeDependencesFromMultipleStores so we need to make sure we filter out mismatching types before reaching this point. Now we do this when the initial candidates are gathered. This is a refinement of the fix made in r262267. Fixes PR27048. llvm-svn: 264313	2016-03-24 17:59:26 +00:00
Simon Atanasyan	26fe92d19f	[MC][mips] Add MipsMCInstrAnalysis class and register it as MC instruction analyzer The `MipsMCInstrAnalysis` class overrides the `evaluateBranch` method and calculates target addresses for branch and call instructions. That allows llvm-objdump to print function names in branch instructions in disassembly mode. Differential Revision: http://reviews.llvm.org/D18209 llvm-svn: 264309	2016-03-24 17:18:14 +00:00
Sanjoy Das	8f42b7b3cd	Remove unnecessary redirect from test llvm-svn: 264308	2016-03-24 17:18:00 +00:00
Rafael Espindola	fe26864440	Fix gold tests for llvm-readobj format change. llvm-svn: 264306	2016-03-24 16:45:41 +00:00
Simon Atanasyan	b7807a0c8e	[llvm-readobj] Decode st_other symbol's flags The patch supports common STV_xxx visibility flags and MIPS specific STO_MIPS_xxx flags. Differential Revision: http://reviews.llvm.org/D18447 llvm-svn: 264300	2016-03-24 16:10:37 +00:00
Elena Demikhovsky	95f3173ce9	AVX-512: Generate KTEST instead of TEST for i1 vectors KTEST instruction may be used instead of TEST in this case: %int_sel3 = bitcast <8 x i1> %sel3 to i8 %res = icmp eq i8 %int_sel3, zeroinitializer br i1 %res, label %L2, label %L1 Differential Revision: http://reviews.llvm.org/D18444 llvm-svn: 264298	2016-03-24 15:53:45 +00:00
Tim Northover	4498eff9bb	CodeGen: extend RHS when splitting ATOMIC_CMP_SWAP_WITH_SUCCESS. If the operation's type has been promoted during type legalization, we need to account for the fact that the high bits of the comparison operand are likely unspecified. The LHS is usually zero-extended, but MIPS sign extends it, so we have to be slightly careful. Patch by Simon Dardis. llvm-svn: 264296	2016-03-24 15:38:38 +00:00
Rafael Espindola	e1c42ac12b	Fix another case where we were unconditionally linking linkonce GVs. With this I think that now llvm-link, lld and the gold plugin should agree on which symbol is kept. llvm-svn: 264292	2016-03-24 15:23:01 +00:00
Rafael Espindola	42e0323768	Fix resolution of linkonce symbols in comdats. After comdat processing, the symbols still go through regular symbol resolution. We were not doing it for linkonce symbols since they are lazy linked. This fixes pr27044. llvm-svn: 264288	2016-03-24 14:58:44 +00:00
Daniel Sanders	15f8fb6f83	[mips] Range check vsplat_simm5 and vsplat_simm10 Summary: Reviewers: vkalintiris Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D18177 llvm-svn: 264287	2016-03-24 14:53:40 +00:00
Pirama Arumuga Nainar	dc45aef2d8	Remove unsafe AssertZext after promoting result of FP_TO_FP16 Summary: Some target lowerings of FP_TO_FP16, for instance ARM's vcvtb.f16.f32 instruction, do not guarantee that the top 16 bits are zeroed out. Remove the unsafe AssertZext and add tests to exercise this. Reviewers: jmolloy, sbaranga, kristof.beyls, aadg Subscribers: llvm-commits, srhines, aemerson Differential Revision: http://reviews.llvm.org/D18426 llvm-svn: 264285	2016-03-24 14:06:03 +00:00
Nemanja Ivanovic	5ebc92dbe1	[PowerPC] Disable direct moves for extractelement and bitcast in 32-bit mode This patch corresponds to review: http://reviews.llvm.org/D17711 It disables direct moves on these operations in 32-bit mode since the patterns assume 64-bit registers. The final patch is slightly different from the Phabricator review as the bitcast operations needed to be disabled in 32-bit mode as well. This fixes PR26617. llvm-svn: 264282	2016-03-24 13:40:33 +00:00
Amjad Aboud	6ff7e10052	Recommitted r263424 "Supporting all entities declared in lexical scope in LLVM debug info." After fixing PR26942 (the fix is included in this commit). Differential Revision: http://reviews.llvm.org/D18350 llvm-svn: 264280	2016-03-24 13:30:16 +00:00
Daniel Sanders	837f15187b	[mips] Range check simm10 Summary: Reviewers: vkalintiris Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D18148 llvm-svn: 264279	2016-03-24 13:26:59 +00:00
Simon Pilgrim	572ca71573	[X86][XOP] Support for VPPERM byte shuffle instruction This patch begins adding support for lowering to the XOP VPPERM instruction - adding the X86ISD::VPPERM opcode. Differential Revision: http://reviews.llvm.org/D18189 llvm-svn: 264260	2016-03-24 11:52:43 +00:00
Zlatko Buljan	94af4cbcf4	[mips][microMIPS] Add CodeGen support for DIV, MOD, DIVU, MODU, DDIV, DMOD, DDIVU and DMODU instructions Differential Revision: http://reviews.llvm.org/D17137 llvm-svn: 264248	2016-03-24 09:22:45 +00:00
James Molloy	ee675880b8	[llvm-nm] Correct -P ELF output Correctly add a space between the address and size when outputting in posix mode (-P). llvm-svn: 264247	2016-03-24 09:18:09 +00:00
Hrvoje Varga	2cb74ac3c3	[mips][microMIPS] Implement MTC, MTHC and DMTC* instructions Differential Revision: http://reviews.llvm.org/D17328 llvm-svn: 264246	2016-03-24 08:02:09 +00:00
Hrvoje Varga	dbea1a1e51	[mips][microMIPS] Fix for "Cannot copy registers" assertion Differential Revision: http://reviews.llvm.org/D17068 llvm-svn: 264245	2016-03-24 06:05:35 +00:00
Adam Nemet	279784ffc4	[LAA] Support memchecks involving loop-invariant addresses We used to only allow SCEVAddRecExpr for pointer expressions in order to be able to compute the bounds. However this is also trivially possible for loop-invariant addresses (scUnknown) since then the bounds are the address itself. Interestingly, we used to allow this for the special case when the loop-invariant address happens to also be an SCEVAddRecExpr (in an outer loop). There are a couple more loops that are vectorized in SPEC after this. My guess is that the main reason we don't see more is that, for example, a loop-invariant load is vectorized into a splat vector with several vector-inserts. This is likely to make the vectorization unprofitable. I.e. we don't notice that a later LICM will move all of this out of the loop so the cost estimate should really be 0. llvm-svn: 264243	2016-03-24 04:28:47 +00:00
Simon Pilgrim	0110890a79	[X86][SSE] Added tests to ensure that consecutive loads including any/all volatiles are not combined llvm-svn: 264225	2016-03-24 00:14:37 +00:00
Paul Robinson	f81836bd18	[PS4] Guarantee an instruction after a 'noreturn' call. We need the "return address" of a noreturn call to be within the bounds of the calling function; TrapUnreachable turns 'unreachable' into a 'ud2' instruction, which has that desired effect. Differential Revision: http://reviews.llvm.org/D18414 llvm-svn: 264224	2016-03-24 00:10:03 +00:00
Rafael Espindola	1ee9fbd842	Fix lazy linking of comdat members. If not for lazy linking of linkonce GVs, comdats are just a preprocessing before symbol resolution. Lazy linking complicates it since when we pick a visible member of comdat, we have to make sure the rest of it passes symbol resolution too. llvm-svn: 264223	2016-03-24 00:06:03 +00:00
Mike Aizatsky	5c79bb364a	[sancov] -print-coverage-stats option to print various coverage statistics. Differential Revision: http://reviews.llvm.org/D18418 llvm-svn: 264222	2016-03-24 00:00:08 +00:00
Matt Arsenault	30d37a74da	AMDGPU: Remove atomic inc/dec patterns There is no benefit to these since materializing the constant 1 requires the same number of instructions as materializing uint_max llvm-svn: 264215	2016-03-23 23:23:38 +00:00
Matt Arsenault	0a30e456b4	AMDGPU: Promote alloca should skip volatiles llvm-svn: 264214	2016-03-23 23:17:29 +00:00
Matt Arsenault	f43c2a0b49	AMDGPU: Insert moves of frame index to value operands Strengthen tests of storing frame indices. Right now this just creates irrelevant scheduling changes. We don't want to have multiple frame index operands on an instruction. There seem to be various assumptions that at least the same frame index will not appear twice in the LocalStackSlotAllocation pass. There's no reason to have this happen, and it just makes it easy to introduce bugs where the immediate offset is applied to the storing instruction when it should really be applied to the value being stored as a separate add. This might not be sufficient. It might still be problematic to have an add fi, fi situation, but that's even less likely to happen in real code. llvm-svn: 264200	2016-03-23 21:49:25 +00:00
Cong Hou	94710840fb	Allow X86::COND_NE_OR_P and X86::COND_NP_OR_E to be reversed. Currently, AnalyzeBranch() fails non-equality comparison between floating points on X86 (see https://llvm.org/bugs/show_bug.cgi?id=23875). This is because this function can modify the branch by reversing the conditional jump and removing unconditional jump if there is a proper fall-through. However, in the case of non-equality comparison between floating points, this can turn the branch "unanalyzable". Consider the following case: jne .BB1 jp .BB1 jmp .BB2 .BB1: ... .BB2: ... AnalyzeBranch() will reverse "jp .BB1" to "jnp .BB2" and then "jmp .BB2" will be removed: jne .BB1 jnp .BB2 .BB1: ... .BB2: ... However, AnalyzeBranch() cannot analyze this branch anymore as there are two conditional jumps with different targets. This may disable some optimizations like block-placement: in this case the fall-through behavior is enforced even if the fall-through block is very cold, which is suboptimal. Actually this optimization is also done in block-placement pass, which means we can remove this optimization from AnalyzeBranch(). However, currently X86::COND_NE_OR_P and X86::COND_NP_OR_E are not reversible: there are no defined negation conditions for them. In order to reverse them, this patch defines two new CondCode X86::COND_E_AND_NP and X86::COND_P_AND_NE. It also defines how to synthesize instructions for them. Here only the second conditional jump is reversed. This is valid as we only need them to do this "unconditional jump removal" optimization. Differential Revision: http://reviews.llvm.org/D11393 llvm-svn: 264199	2016-03-23 21:45:37 +00:00
Rafael Espindola	f2e71244c6	Fix logic for which symbols to keep with comdats. If a comdat is dropped, all symbols in it are dropped. If a comdat is kept, the symbols survive to pass regular symbol resolution. With this patch we do that for all global symbols. The added test is a copy of test/tools/gold/X86/comdat.ll that we now pass. llvm-svn: 264192	2016-03-23 21:16:33 +00:00
Kevin Enderby	5afbc1cda7	Fix a crash in running llvm-objdump -t with an invalid Mach-O file already in the test suite. While this is not really an interesting tool and option to run on a Mach-O file to show the symbol table in a generic libObject format it shouldn’t crash. The reason for the crash was in MachOObjectFile::getSymbolType() when it was calling MachOObjectFile::getSymbolSection() without checking its return value for the error case. What makes this fix require a fair bit of diffs is that the method getSymbolType() is in the class ObjectFile defined without an ErrorOr<> so I needed to add that to all the sub classes. And all of the uses needed to be updated and the return value needed to be checked for the error case. The MachOObjectFile version of getSymbolType() “can” get an error in trying to come up with the libObject’s internal SymbolRef::Type when the Mach-O symbol type is an N_SECT type because the code is trying to select from the SymbolRef::ST_Data or SymbolRef::ST_Function values for the SymbolRef::Type. And it needs the Mach-O section to use isData() and isBSS to determine if it will return SymbolRef::ST_Data. One other possible fix I considered is to simply return SymbolRef::ST_Other when MachOObjectFile::getSymbolSection() returned an error. But since in the past when I did such changes that “ate an error in the libObject code” I was asked instead to push the error out of the libObject code I chose not to implement the fix this way. As currently written both the COFF and ELF versions of getSymbolType() can’t get an error. But if isReservedSectionNumber() wanted to check for the two known negative values rather than allowing all negative values or the code wanted to add the same check as in getSymbolAddress() to use getSection() and check for the error then these versions of getSymbolType() could return errors.
At the end of the day the error printed now is the generic “Invalid data was encountered while parsing the file” for object_error::parse_failed. In the future when we thread Lang’s new TypedError for recoverable error handling through libObject this will improve. And where the added // Diagnostic(… comment is, it would be changed to produce an error message like “bad section index (42) for symbol at index 8” for this case. llvm-svn: 264187	2016-03-23 20:27:00 +00:00
Kyle Butt	613112826e	Codegen: [PPC] Word Rotates are Zero Extending. Add Word rotates to the list of instructions that are zero extending. This allows them to be used in dot form to compare with zero. llvm-svn: 264183	2016-03-23 19:51:22 +00:00
George Burgess IV	0e4898685f	Fix bugs in the MemorySSA walker. There are a few bugs in the walker that this patch addresses. Primarily: - Caching can break when we have multiple BBs without phis - We weren't optimizing some phis properly - Because of how the DFS iterator works, there were times where we wouldn't cache any results of our DFS I left the test cases with FIXMEs in, because I'm not sure how much effort it will take to get those to work (read: We'll probably ultimately have to end up redoing the walker, or we'll have to come up with some creative caching tricks), and more test coverage = better. Differential Revision: http://reviews.llvm.org/D18065 llvm-svn: 264180	2016-03-23 18:31:55 +00:00
Simon Pilgrim	b5fb65d43e	[X86] Regenerated WidenArith test llvm-svn: 264157	2016-03-23 14:00:28 +00:00
Oliver Stannard	aa77b1e025	[AArch64] Replace some uses of report_fatal_error with reportError in AArch64 ELF object writer If we can't handle a relocation type, report it as an error in the source, rather than asserting. I've added a more descriptive message and a test for the only cases of this that I've been able to trigger. Differential Revision: http://reviews.llvm.org/D18388 llvm-svn: 264156	2016-03-23 13:45:03 +00:00
Andrey Turetskiy	6a3d561ea0	[X86] Introduction of FeatureX87. Add FeatureX87 in X86 backend to be able to define CPUs which don't have x87. Differential Revision: http://reviews.llvm.org/D13979 llvm-svn: 264148	2016-03-23 11:13:54 +00:00
Hrvoje Varga	c45baf212a	[mips][microMIPS] Delay slot filler modifications Differential Revision: http://reviews.llvm.org/D18181 llvm-svn: 264147	2016-03-23 10:29:38 +00:00
Valery Pykhtin	c0a77c5064	[AMDGPU] Fix missing assembler predicates. Differential Revision: http://reviews.llvm.org/D18351 llvm-svn: 264137	2016-03-23 04:27:26 +00:00
Sanjoy Das	e58ca59cf4	[StatepointLowering] Schedule gc relocates before uniqueing them Otherwise we can see an "unexpected" gc.relocate that we uniqued away. llvm-svn: 264127	2016-03-23 02:24:07 +00:00
Justin Lebar	e87e1c6cdd	[NVVM] Remove noduplicate attribute from synchronizing intrinsics. Summary: I've completed my audit of all the code that looks at noduplicate and added handling of convergent where appropriate, so we no longer need noduplicate on these intrinsics. Reviewers: jholewinski Subscribers: llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D18168 llvm-svn: 264107	2016-03-22 22:08:01 +00:00
Rafael Espindola	370d528a05	Drop comdats from the dst module if they are not selected. A really unfortunate design of llvm-link and related libraries is that they operate one module at a time. This means they can copy a GV to the destination module that should not be there in the final result because a later bitcode file takes precedence. We already handled cases like a strong GV replacing a weak for example. One case that is not currently handled is a comdat replacing another. This doesn't happen in ELF, but with COFF largest selection kind it is possible. In "llvm-link a.ll b.ll" if the selected comdat was from a.ll, everything will work and we will not copy the comdat from b.ll. But if we run "llvm-link b.ll a.ll", we fail to delete the already copied comdat from b.ll. This patch fixes that. llvm-svn: 264103	2016-03-22 21:35:47 +00:00
George Burgess IV	d4febd1612	Keep CodeGenPrepare from preserving the domtree. CGP modifies the domtree in some cases, so saying that it preserves the domtree is a lie. We'll be able to selectively preserve it with the new pass manager. Differential Revision: http://reviews.llvm.org/D16893 llvm-svn: 264099	2016-03-22 21:25:08 +00:00
Matthias Braun	68bb2931cc	Revert "Support arbitrary addrspace pointers in masked load/store intrinsics" This commit broke LTO builds. Reverting it to unbreak the bots while the issue is investigated. See also: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160321/341002.html This reverts r263158 llvm-svn: 264088	2016-03-22 20:24:34 +00:00
Simon Pilgrim	cc41495eb8	[X86][AVX] Added AVX1 tests for 256-bit vector idiv-by-constant Prep work based on feedback for D18307 llvm-svn: 264086	2016-03-22 20:10:49 +00:00
Simon Pilgrim	c6f5fe3d69	[SelectionDAG] Ensure constant folded legalized vector element types are compatible with the BUILD_VECTOR type Found during fuzz testing - 32-bit x86 targets were legalizing a <2 x i1> compare result to <2 x i32> when <2 x i64> was expected. llvm-svn: 264085	2016-03-22 19:59:53 +00:00
Tim Northover	b49a8a9dbb	CodeGen: check return types match when emitting tail call to builtin. We were just completely ignoring the types when determining whether we could safely emit a libcall as a tail call. This is clearly wrong. Theoretically, we could dig deeper looking for incidental matches (much like the generic code in Analysis.cpp does), but it's probably not worth it for the few libcalls that exist. llvm-svn: 264084	2016-03-22 19:14:38 +00:00
Sanjoy Das	bfecef5e1b	Remove unnecessary branch from test (Addresses post commit review by Reid Kleckner) llvm-svn: 264083	2016-03-22 18:45:41 +00:00
Adam Nemet	8b47e0d0ea	[LoopVersioning] Relax an assert for LCSSA PHIs When you have multiple LCSSA (single-operand) PHIs that are converted into two-operand PHIs due to versioning, only assert that the PHI currently being converted has a single operand. I.e. we don't want to check PHIs that were converted earlier in the loop. Fixes PR27023. Thanks to Karl-Johan Karlsson for the minimized testcase! llvm-svn: 264081	2016-03-22 18:38:15 +00:00
Sanjoy Das	eb5037cadc	Allow lowering call sites with both funclets and deopt state Lowering funclets is a no-op, so we can just go ahead and lower the deopt state. llvm-svn: 264078	2016-03-22 18:10:39 +00:00
Dan Gohman	665d7e3838	[WebAssembly] Implement the rotate instructions. llvm-svn: 264076	2016-03-22 18:01:49 +00:00
Simon Pilgrim	25fb4177fb	[X86][SSE] Reapplied: Simplify vector LOAD + EXTEND on pre-SSE41 hardware Improve vector extension of vectors on hardware without dedicated VSEXT/VZEXT instructions. We already convert these to SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG but can further improve this by using the legalizer instead of prematurely splitting into legal vectors in the combine as this only properly helps for lowering to VSEXT/VZEXT. Removes a lot of unnecessary any_extend + mask pattern - (Fix for PR25718). Reapplied with a fix for PR26953 (missing vector widening legalization). Differential Revision: http://reviews.llvm.org/D17932 llvm-svn: 264062	2016-03-22 16:22:08 +00:00
Daniel Sanders	97297770a6	[mips] Range check simm7. Summary: Also renamed li_simm7 to li16_imm since it's not a simm7 and has an unusual encoding (it's a uimm7 except that 0x7f represents -1). Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18145 llvm-svn: 264056	2016-03-22 14:40:00 +00:00
Daniel Sanders	0f17d0da4a	[mips] Range check simm5. Summary: We can't check the error message for this one because there's another lw/sw available that covers a larger range. We therefore check the transition between the two sizes. Reviewers: vkalintiris Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D18144 llvm-svn: 264054	2016-03-22 14:29:53 +00:00
Daniel Sanders	946dee3b5b	[mips] Range check vsplat_uimm[1234568]. Summary: Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18143 llvm-svn: 264053	2016-03-22 14:17:41 +00:00
Daniel Sanders	93fa4ce9b7	[mips] Range check uimm4_ptr, remove uimm6_ptr, and use correctly sized immediates in MSA copy/insert. Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18142 llvm-svn: 264052	2016-03-22 13:58:53 +00:00
Zinovy Nis	07ac2bd4d0	[PATCH] Force LoopReroll to reset the loop trip count value after reroll. It's a bug fix. For rerolled loops the SE trip count remains unchanged. This leads to incorrect behavior in subsequent passes. My patch just resets SE info for the rerolled loop, forcing SE to re-evaluate it the next time it is requested. I also added a verifier call in the existing test to be sure no invalid SE data remains. Without my fix this test would fail with -verify-scev. Differential Revision: http://reviews.llvm.org/D18316 llvm-svn: 264051	2016-03-22 13:50:57 +00:00
Marina Yatsina	33ef7dad18	[ELF][gcc compatibility]: support section names with special characters (e.g. "/") Adding support for section names with special characters in them (e.g. "/"). GCC successfully compiles such section names. This also fixes PR24520. Differential Revision: http://reviews.llvm.org/D15678 llvm-svn: 264038	2016-03-22 11:23:15 +00:00
Sanjoy Das	dd1d72ce92	Appease the windows buildbots The guess is that the stdout/stderr ordering may differ between windows / unix. llvm-svn: 264019	2016-03-22 02:11:57 +00:00
Sanjoy Das	38bfc22161	Add "first class" lowering for deopt operand bundles Summary: After this change, deopt operand bundles can be lowered directly by SelectionDAG into STATEPOINT instructions (which are then lowered to a call or sequence of nop, with an associated __llvm_stackmaps entry). This obviates the need to round-trip deoptimization state through gc.statepoint via RewriteStatepointsForGC. Reviewers: reames, atrick, majnemer, JosephTremoulet, pgavlin Subscribers: sanjoy, mcrosier, majnemer, llvm-commits Differential Revision: http://reviews.llvm.org/D18257 llvm-svn: 264015	2016-03-22 00:59:13 +00:00
George Burgess IV	3887a41725	[MemorySSA] Consider def-only BBs for live-in calculations. If we have a BB with only MemoryDefs, live-in calculations will ignore it. This means we get results like this: define void @foo(i8* %p) { ; 1 = MemoryDef(liveOnEntry) store i8 0, i8* %p br i1 undef, label %if.then, label %if.end if.then: ; 2 = MemoryDef(1) store i8 1, i8* %p br label %if.end if.end: ; 3 = MemoryDef(1) store i8 2, i8* %p ret void } ...When there should be a MemoryPhi in the `if.end` BB. This patch fixes that behavior. llvm-svn: 263991	2016-03-21 21:25:39 +00:00
Krzysztof Parzyszek	67e6ae5e2a	Remove leftover options from multiline.ll I added -march=hexagon to force using Hexagon target when testing locally, and I forgot to take it out. llvm-svn: 263990	2016-03-21 21:25:01 +00:00
Rafael Espindola	7ff714c339	Add a testcase that would have found the bug in r263971. llvm-svn: 263988	2016-03-21 21:09:38 +00:00
Rafael Espindola	9219fe79b9	Revert "[llvm-objdump] Printing relocations in executable and shared object files. This partially reverts r215844 by removing test objdump-reloc-shared.test which stated GNU objdump doesn't print relocations, it does." This reverts commit r263971. It produces the wrong results for .rela.dyn. I will add a test. llvm-svn: 263987	2016-03-21 20:59:15 +00:00
Krzysztof Parzyszek	738c6277a6	Unxfail test/DebugInfo/Generic/multiline.ll on Hexagon llvm-svn: 263986	2016-03-21 20:55:59 +00:00
Nicolai Haehnle	213e87f2ee	AMDGPU: Add SIWholeQuadMode pass Summary: Whole quad mode is already enabled for pixel shaders that compute derivatives, but it must be suspended for instructions that cause a shader to have side effects (i.e. stores and atomics). This pass addresses the issue by storing the real (initial) live mask in a register, masking EXEC before instructions that require exact execution and (re-)enabling WQM where required. This pass is run before register coalescing so that we can use machine SSA for analysis. The changes in this patch expose a problem with the second machine scheduling pass: target independent instructions like COPY implicitly use EXEC when they operate on VGPRs, but this fact is not encoded in the MIR. This can lead to miscompilation because instructions are moved past changes to EXEC. This patch fixes the problem by adding use-implicit operands to target independent instructions. Some general codegen passes are relaxed to work with such implicit use operands. Reviewers: arsenm, tstellarAMD, mareko Subscribers: MatzeB, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18162 llvm-svn: 263982	2016-03-21 20:28:33 +00:00
Krzysztof Parzyszek	b14f4fd0de	[Hexagon] Add handling fixups and instruction relaxation llvm-svn: 263981	2016-03-21 20:27:17 +00:00
Krzysztof Parzyszek	c6f1e1a709	[Hexagon] Properly encode registers in duplex instructions llvm-svn: 263980	2016-03-21 20:13:33 +00:00
Dan Gohman	c8d7f14506	[WebAssembly] Implement the eqz instructions. llvm-svn: 263976	2016-03-21 19:54:41 +00:00
Colin LeMahieu	cdaf644c48	[llvm-objdump] Printing relocations in executable and shared object files. This partially reverts r215844 by removing test objdump-reloc-shared.test which stated GNU objdump doesn't print relocations, it does. In executable and shared object ELF files, relocations in the file contain the final virtual address rather than section offset so this is adjusted to display section offset. Differential revision: http://reviews.llvm.org/D15965 llvm-svn: 263971	2016-03-21 19:14:50 +00:00
Tom Stellard	92339e888f	AMDGPU/SI: Fix threshold calculation for branching when exec is zero Summary: When control flow is implemented using the exec mask, the compiler will insert branch instructions to skip over the masked section when exec is zero if the section contains more than a certain number of instructions. The previous code would only count instructions in successor blocks, and this patch modifies the code to start counting instructions in all blocks between the start and end of the branch. Reviewers: nhaehnle, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18282 llvm-svn: 263969	2016-03-21 18:56:58 +00:00
Matt Arsenault	cb38a6bd35	AMDGPU: Remove SignBitIsZero for mubuf scratch offsets These instructions do not have the same negative base address problem that DS instructions do on SI. llvm-svn: 263964	2016-03-21 18:02:18 +00:00
Peter Collingbourne	86b9fbe980	ARM: Better codegen for 64-bit compares. This introduces a custom lowering for ISD::SETCCE (introduced in r253572) that allows us to emit a short code sequence for 64-bit compares. Before: push {r7, lr} cmp r0, r2 mov.w r0, #0 mov.w r12, #0 it hs movhs r0, #1 cmp r1, r3 it ge movge.w r12, #1 it eq moveq r12, r0 cmp.w r12, #0 bne .LBB1_2 @ BB#1: @ %bb1 bl f pop {r7, pc} .LBB1_2: @ %bb2 bl g pop {r7, pc} After: push {r7, lr} subs r0, r0, r2 sbcs.w r0, r1, r3 bge .LBB1_2 @ BB#1: @ %bb1 bl f pop {r7, pc} .LBB1_2: @ %bb2 bl g pop {r7, pc} Saves around 80KB in Chromium's libchrome.so. Some notes on this patch: - I don't much like the ARMISD::BRCOND and ARMISD::CMOV combines I introduced (nothing else needs them). However, they are necessary in order to avoid poor codegen, and they seem similar to existing combines in other backends (e.g. X86 combines (brcond (cmp (setcc Compare))) to (brcond Compare)). - No support for Thumb-1. This is in principle possible, but we'd need to implement ARMISD::SUBE for Thumb-1. Differential Revision: http://reviews.llvm.org/D15256 llvm-svn: 263962	2016-03-21 18:00:02 +00:00
Renato Golin	2b6b7ffd6c	[ARM] Add Cortex-A32 support Adding Cortex-A32 as an available target in the ARM backend. Patch by Sam Parker. llvm-svn: 263956	2016-03-21 17:29:01 +00:00
Hemant Kulkarni	a11fbe1cb1	[llvm-readobj] Impl GNU style symbols printing Implements "readelf -sW and readelf -DsW" Differential Revision: http://reviews.llvm.org/D18224 llvm-svn: 263952	2016-03-21 17:18:23 +00:00
Matt Arsenault	b96b57347a	AMDGPU: Add frexp_mant intrinsic llvm-svn: 263948	2016-03-21 16:11:05 +00:00
Matt Arsenault	155dda9134	Implement constant folding for bitreverse llvm-svn: 263945	2016-03-21 15:00:35 +00:00
Silviu Baranga	f875e4fd92	[IndVars] Fix PR26974: make sure replaceCongruentIVs doesn't break LCSSA Summary: replaceCongruentIVs can break LCSSA when trying to replace IV increments since it tries to replace all uses of a phi node with another phi node while both of the phi nodes are not necessarily in the processed loop. This will cause an assert in IndVars. To fix this, we add a check to make sure that the replacement maintains LCSSA. Reviewers: sanjoy Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18266 llvm-svn: 263941	2016-03-21 12:44:29 +00:00
Silviu Baranga	46030585b3	[DAGCombine] Catch the case where extract_vector_elt can cause an any_ext while processing AND SDNodes Summary: extract_vector_elt can cause an implicit any_ext if the types don't match. When processing the following pattern: (and (extract_vector_elt (load ([non_ext\|any_ext\|zero_ext] V))), c) DAGCombine was ignoring the possible extend, and sometimes removing the AND even though it was required to maintain some of the bits in the result to 0, resulting in a miscompile. This change fixes the issue by limiting the transformation only to cases where the extract_vector_elt doesn't perform the implicit extend. Reviewers: t.p.northover, jmolloy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18247 llvm-svn: 263935	2016-03-21 11:43:46 +00:00
Elena Demikhovsky	39a0020f2d	Fixed -mcpu flag "core-avx" does not exist; I changed to "nehalem" llvm-svn: 263932	2016-03-21 11:06:20 +00:00
Simon Pilgrim	4af44f3c13	[X86][SSE] Add vector integer division by constant tests Expanded tests and split into sdiv/srem and udiv/urem cases for 128 and 256 bit vectors. llvm-svn: 263917	2016-03-20 21:46:58 +00:00
Jingyue Wu	1375560bdb	[NVPTX] Adds a new address space inference pass. Summary: The old address space inference pass (NVPTXFavorNonGenericAddrSpaces) is unable to convert the address space of a pointer induction variable. This patch adds a new pass called NVPTXInferAddressSpaces that overcomes that limitation using a fixed-point data-flow analysis (see the file header comments for details). The new pass is experimental and not enabled by default. Users can turn it on by setting the -nvptx-use-infer-addrspace flag of llc. Reviewers: jholewinski, tra, jlebar Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D17965 llvm-svn: 263916	2016-03-20 20:59:20 +00:00
Simon Pilgrim	c44472a5bc	[X86][SSE] Detect zeroable shuffle elements from different value types Improve computeZeroableShuffleElements to be able to peek through bitcasts to extract zero/undef values from BUILD_VECTOR nodes of different element sizes to the shuffle mask. Differential Revision: http://reviews.llvm.org/D14261 llvm-svn: 263906	2016-03-20 15:45:42 +00:00
Igor Breger	3ea8af5108	AVX512BW: Enable v32i1/v64i1 BUILD_VECTOR Differential Revision: http://reviews.llvm.org/D18211 llvm-svn: 263898	2016-03-20 13:09:43 +00:00
Mehdi Amini	43165d913a	Expose IRBuilder::CreateAtomicCmpXchg as LLVMBuildAtomicCmpXchg in the C API. Summary: Also expose getters and setters in the C API, so that the change can be tested. Reviewers: nhaehnle, axw, joker.eph Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18260 From: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> llvm-svn: 263886	2016-03-19 21:28:28 +00:00
David Majnemer	abae6b588b	[SimplifyLibCalls] Only consider sinpi/cospi functions within the same function The sinpi/cospi can be replaced with sincospi to remove unnecessary computations. However, we need to make sure that the calls are within the same function! This fixes PR26993. llvm-svn: 263875	2016-03-19 04:53:02 +00:00
David Majnemer	cdf2873e36	[InstCombine] Don't insert instructions before a catch switch CatchSwitches are not splittable, we cannot insert casts, etc. before them. This fixes PR26992. llvm-svn: 263874	2016-03-19 04:39:52 +00:00
Mehdi Amini	8d05185a26	Rework linkInModule(), making it oblivious to ThinLTO Summary: ThinLTO is relying on linkInModule to import selected functions. However a lot of "magic" was hidden in linkInModule and the IRMover, which would rename and promote global variables on the fly. This is moving to an approach where the steps are decoupled and the client is responsible for specifying the list of globals to import. As a consequence some tests are changed because they were relying on the previous behavior which was importing the definition of every single global without control on the client side. Now the burden is on the client to decide if a global has to be imported or not. Reviewers: tejohnson Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D18122 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 263863	2016-03-19 00:40:31 +00:00
Mehdi Amini	155da5b132	Add a test for r263577: "Add missing error handling in llvm-lto" On Rafael's suggestion! (also fix a discrepancy between this error message format and the others) From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 263860	2016-03-19 00:17:32 +00:00
Manman Ren	a3a019cf90	[CXX_FAST_TLS] Fix issues in ARM. We need to be careful on which registers can be explicitly handled via copies. Prologue, Epilogue use physical registers and if one belongs to the set of CSRsViaCopy, it will no longer be CSRed, since PEI overwrites it after the explicit copies. llvm-svn: 263857	2016-03-18 23:44:37 +00:00
Manman Ren	4865d89653	[CXX_FAST_TLS] Disable tail call when calling conventions are mismatched. Since CXX_FAST_TLS has a bigger set of CSRs, we don't tail call when caller and callee have mismatched calling conventions. llvm-svn: 263856	2016-03-18 23:41:51 +00:00
Manman Ren	2828c57b6f	[CXX_FAST_TLS] fix issues with O0 on ARM, AArch64 and X86. Since at O0, explicit copies via SplitCSR may not be removed even if they are unnecessary, we choose not to use SplitCSR at O0. llvm-svn: 263855	2016-03-18 23:38:49 +00:00
Michael Kuperstein	5abc2765fa	Have DataLayout::isLegalInteger() accept uint64_t While not strictly necessary, since we don't support large integer types, this avoids bugs due to silent truncation from uint64_t to a 32-bit unsigned (e.g. DL.isLegalInteger(DL.getTypeSizeInBits(Ty))). This fixes PR26972. Differential Revision: http://reviews.llvm.org/D18258 llvm-svn: 263850	2016-03-18 23:19:29 +00:00
Alexei Starovoitov	7e453bb8be	BPF: emit an error message for unsupported signed division operation Signed-off-by: Yonghong Song <yhs@plumgrid.com> Signed-off-by: Alexei Starovoitov <ast@fb.com> llvm-svn: 263842	2016-03-18 22:02:47 +00:00
Sanjoy Das	60fb899f28	[IndVars] Pass the right loop to isLoopInvariantPredicate The loop on IVOperand's incoming values assumes IVOperand to be an induction variable on the loop over which `S Pred X` is invariant; otherwise loop invariant incoming values to IVOperand are not guaranteed to dominate the comparison. This fixes PR26973. llvm-svn: 263827	2016-03-18 20:37:07 +00:00
Chad Rosier	cdfd7e7201	[AArch64] Enable more load clustering in the MI Scheduler. This patch adds unscaled loads and sign-extend loads to the TII getMemOpBaseRegImmOfs API, which is used to control clustering in the MI scheduler. This is done to create more opportunities for load pairing. I've also added the scaled LDRSWui instruction, which was missing from the scaled instructions. Finally, I've added support in shouldClusterLoads for clustering adjacent sext and zext loads that too can be paired by the load/store optimizer. Differential Revision: http://reviews.llvm.org/D18048 llvm-svn: 263819	2016-03-18 19:21:02 +00:00
Reid Kleckner	fbd7787d7e	[codeview] Only emit function ids for inlined functions We aren't referencing any other kind of function currently. Should save a bit on our debug info size. llvm-svn: 263817	2016-03-18 18:54:32 +00:00
Colin LeMahieu	0143146514	[MCParser] Accept uppercase radix variants 0X and 0B Differential Revision: http://reviews.llvm.org/D14781 llvm-svn: 263802	2016-03-18 18:22:07 +00:00
Colin LeMahieu	307a83d76a	[llvm-objdump] Print <unknown> in place of instruction text if it couldn't be disassembled. llvm-svn: 263793	2016-03-18 16:26:48 +00:00
Nicolai Haehnle	95e8ffd398	AMDGPU: Overload return type of llvm.amdgcn.buffer.load.format Summary: Allow the selection of BUFFER_LOAD_FORMAT_x and _XY. Do this now before the frontend patches land in Mesa. Eventually, we may want to automatically reduce the size of loads at the LLVM IR level, which requires such overloads, and in some cases Mesa can generate them directly. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18255 llvm-svn: 263792	2016-03-18 16:24:40 +00:00
Nicolai Haehnle	ad63638f6d	AMDGPU/SI: Add llvm.amdgcn.buffer.atomic.* intrinsics Summary: These intrinsics expose the BUFFER_ATOMIC_* instructions and will be used by Mesa to implement atomics with buffer semantics. The intrinsic interface matches that of buffer.load.format and buffer.store.format, except that the GLC bit is not exposed (it is automatically deduced based on whether the return value is used). The change of hasSideEffects is required for TableGen to accept the pattern that matches the intrinsic. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, rivanvx, llvm-commits Differential Revision: http://reviews.llvm.org/D18151 llvm-svn: 263791	2016-03-18 16:24:31 +00:00
Nicolai Haehnle	3003ba00a3	AMDGPU: use ComplexPattern for offsets in llvm.amdgcn.buffer.load/store.format Summary: We cannot easily deduce that an offset is in an SGPR, but the Mesa frontend cannot easily make use of an explicit soffset parameter either. Furthermore, it is likely that in the future, LLVM will be in a better position than the frontend to choose an SGPR offset if possible. Since there aren't any frontend uses of these intrinsics in upstream repositories yet, I would like to take this opportunity to change the intrinsic signatures to a single offset parameter, which is then selected to immediate offsets or voffsets using a ComplexPattern. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18218 llvm-svn: 263790	2016-03-18 16:24:20 +00:00
Sam Kolton	a74cd526e9	[AMDGPU] Assembler: Change dpp_ctrl syntax to match sp3 Review: http://reviews.llvm.org/D18267 llvm-svn: 263789	2016-03-18 15:35:51 +00:00
Simon Atanasyan	05f4c803bb	[llvm-objdump] Move test case to the X86 sub-directory because it depends on X86 target support. NFC. llvm-svn: 263781	2016-03-18 09:52:12 +00:00
Adam Nemet	709e3046ee	[LoopDataPrefetch] Add TTI to limit the number of iterations to prefetch ahead Summary: It can hurt performance to prefetch ahead too much. Be conservative for now and don't prefetch ahead more than 3 iterations on Cyclone. Reviewers: hfinkel Subscribers: llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D17949 llvm-svn: 263772	2016-03-18 00:27:43 +00:00
Adam Nemet	6d8beeca53	[LoopDataPrefetch/Aarch64] Allow selective prefetching of large-strided accesses Summary: And use this TTI for Cyclone. As it was explained in the original RFC (http://thread.gmane.org/gmane.comp.compilers.llvm.devel/92758), the HW prefetcher works up to 2KB strides. I am also adding tests for this and the previous change (D17943): * Cyclone prefetching accesses with a large stride * Cyclone not prefetching accesses with a small stride * Generic Aarch64 subtarget not prefetching either Reviewers: hfinkel Subscribers: aemerson, rengolin, llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D17945 llvm-svn: 263771	2016-03-18 00:27:38 +00:00
Peter Collingbourne	a1f8625662	DebugInfo: Add ability to not emit DW_AT_vtable_elem_location for virtual functions. A virtual index of -1u indicates that the subprogram's virtual index is unrepresentable (for example, when using the relative vtable ABI), so do not emit a DW_AT_vtable_elem_location attribute for it. Differential Revision: http://reviews.llvm.org/D18236 llvm-svn: 263765	2016-03-17 23:58:03 +00:00
Tim Shen	5cdf75084a	[PPC, FastISel] Fix ordered/unordered fcmp For fcmp, major concern about the following 6 cases is NaN result. The comparison result consists of 4 bits, indicating lt, eq, gt and un (unordered), only one of which will be set. The result is generated by fcmpu instruction. However, bc instruction only inspects one of the first 3 bits, so when un is set, bc instruction may jump to an undesired place. More specifically, if we expect an unordered comparison and un is set, we expect to always go to true branch; in such case UEQ, UGT and ULT still give false, which is undesired; but UNE, UGE, ULE happen to give true, since they are tested by inspecting !eq, !lt, !gt, respectively. Similarly, for ordered comparison, when un is set, we always expect the result to be false. In such case OGT, OLT and OEQ are good, since they are actually testing GT, LT, and EQ respectively, which are false. OGE, OLE and ONE are tested through !lt, !gt and !eq, and these are true. llvm-svn: 263753	2016-03-17 22:27:58 +00:00
Adam Nemet	b0c4eae073	[LoopVectorize] Annotate versioned loop with noalias metadata Summary: Use the new LoopVersioning facility (D16712) to add noalias metadata in the vector loop if we versioned with memchecks. This can enable some optimization opportunities further down the pipeline (see the included test or the benchmark improvement quoted in D16712). The test also covers the bug I had in the initial version in D16712. The vectorizer did not previously use LoopVersioning. The reason is that the vectorizer performs its transformations in single shot. It creates an empty single-block vector loop that it then populates with the widened, if-converted instructions. Thus creating an intermediate versioned scalar loop seems wasteful. So this patch (rather than bringing in LoopVersioning fully) adds a special interface to LoopVersioning to allow the vectorizer to add no-alias annotation while still performing its own versioning. As the vectorizer propagates metadata from the instructions in the original loop to the vector instructions we also check the pointer in the original instruction and see if LoopVersioning can add no-alias metadata based on the issued memchecks. Reviewers: hfinkel, nadav, mzolotukhin Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D17191 llvm-svn: 263744	2016-03-17 20:32:37 +00:00
Adam Nemet	5eccf07df3	[LoopVersioning] Annotate versioned loop with noalias metadata Summary: If we decide to version a loop to benefit a transformation, it makes sense to record the now non-aliasing accesses in the newly versioned loop. This allows non-aliasing information to be used by subsequent passes. One example is 456.hmmer in SPECint2006 where after loop distribution, we vectorize one of the newly distributed loops. To vectorize we version this loop to fully disambiguate may-aliasing accesses. If we add the noalias markers, we can use the same information in a later DSE pass to eliminate some dead stores which amounts to ~25% of the instructions of this hot memory-pipeline-bound loop. The overall performance improves by 18% on our ARM64. The scoped noalias annotation is added in LoopVersioning. The patch then enables this for loop distribution. A follow-on patch will enable it for the vectorizer. Eventually this should be run by default when versioning the loop but first I'd like to get some feedback whether my understanding and application of scoped noalias metadata is correct. Essentially my approach was to have a separate alias domain for each versioning of the loop. For example, if we first version in loop distribution and then in vectorization of the distributed loops, we have a different set of memchecks for each versioning. By keeping the scopes in different domains they can conveniently be defined independently since different alias domains don't affect each other. As written, I also have a separate domain for each loop. This is not necessary and we could save some metadata here by using the same domain across the different loops. I don't think it's a big deal either way. Probably the best is to review the tests first to see if I mapped this problem correctly to scoped noalias markers. I have plenty of comments in the tests. Note that the interface is prepared for the vectorizer which needs the annotateInstWithNoAlias API. The vectorizer does not use LoopVersioning so we need a way to pass in the versioned instructions. This is also why the maps have to become part of the object state. Also currently, we only have an AA-aware DSE after the vectorizer if we also run the LTO pipeline. Depending how widely this triggers we may want to schedule a DSE toward the end of the regular pass pipeline. Reviewers: hfinkel, nadav, ashutosh.nema Subscribers: mssimpso, aemerson, llvm-commits, mcrosier Differential Revision: http://reviews.llvm.org/D16712 llvm-svn: 263743	2016-03-17 20:32:32 +00:00
Tim Northover	498c56c240	ARM: stop asserting on weird <3 x Ty> vectors in ISelLowering. llvm-svn: 263741	2016-03-17 20:10:28 +00:00
Guozhi Wei	7b390ec4cd	[InstCombine] Combine A->B->A BitCast This patch enhances InstCombine to handle following case: A -> B bitcast PHI B -> A bitcast llvm-svn: 263734	2016-03-17 18:47:20 +00:00
Valery Pykhtin	c4546ec0cf	[AMDGPU] add VI disassembler tests. NFC. Autogenerated from the corresponding assembler tests with a few FIXME added (will fix soon). Differential Revision: http://reviews.llvm.org/D18249 llvm-svn: 263729	2016-03-17 17:56:33 +00:00
Petar Jovanovic	0b44f24033	[PowerPC] Disable CTR loops optimization for soft float operations This patch prevents CTR loops optimization when using soft float operations inside loop body. Soft float operations use function calls, but function calls are not allowed inside CTR optimized loops. Patch by Aleksandar Beserminji. Differential Revision: http://reviews.llvm.org/D17600 llvm-svn: 263727	2016-03-17 17:11:33 +00:00
Derek Schuff	d4207ba0f6	[WebAssembly] Stackify code emitted by eliminateFrameIndex and SP writeback Summary: MRI::eliminateFrameIndex can emit several instructions to do address calculations; these can usually be stackified. Because instructions with FI operands can have subsequent operands which may be expression trees, find the top of the leftmost tree and insert the code before it, to keep the LIFO property. Also use stackified registers when writing back the SP value to memory in the epilog; it's unnecessary because SP will not be used after the epilog, and it results in better code. Differential Revision: http://reviews.llvm.org/D18234 llvm-svn: 263725	2016-03-17 17:00:29 +00:00
Changpeng Fang	234fcb81d3	AMDGPU/SI: Do not generate s_waitcnt after ds_permute/ds_bpermute Summary: ds_permute/ds_bpermute do not read memory so s_waitcnt is not needed. Reviewers: arsenm, tstellarAMD Subscribers: llvm-commits, arsenm Differential Revision: http://reviews.llvm.org/D18197 llvm-svn: 263720	2016-03-17 16:43:50 +00:00
Nicolai Haehnle	79cad857a0	AMDGPU: mark atomic instructions as sources of divergence Summary: As explained by the comment, threads will typically see different values returned by atomic instructions even if the arguments are equal. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18156 llvm-svn: 263719	2016-03-17 16:21:59 +00:00
Simon Pilgrim	0f37fbac51	[X86][SSE] Simplified blend-with-zero combining We were being too aggressive in trying to combine a shuffle into a blend-with-zero pattern, often resulting in an endless loop of contrasting combines. This patch stops the combine if we already have a blend in place (means we miss some domain corrections) llvm-svn: 263717	2016-03-17 15:59:36 +00:00
Sanjay Patel	9e23fedaf0	propagate 'unpredictable' metadata on select instructions This is similar to D18133 where we allowed profile weights on select instructions. This extends that change to also allow the 'unpredictable' attribute of branches to apply to selects. A test to check that 'unpredictable' metadata is preserved when cloning instructions was checked in at: http://reviews.llvm.org/rL263648 Differential Revision: http://reviews.llvm.org/D18220 llvm-svn: 263716	2016-03-17 15:30:52 +00:00
Saleem Abdulrasool	071a099102	ARM: Revert SVN r253865, 254158, fix windows division The two changes together weakened the test and caused a regression with division handling in MSVC mode. They were applied to avoid an assertion being triggered in the block frequency analysis. However, the underlying problem was simply being masked rather than solved properly. Address the actual underlying problem and revert the changes. Rather than analyze the cause of the assertion, the division failure was assumed to be an overflow. The underlying issue was a subtle bug in the BB construction in the emission of the div-by-zero check (WIN__DBZCHK). We did not construct the proper successor information in the basic blocks, nor did we update the PHIs associated with the basic block when we split them. This would result in assertions being triggered in the block frequency analysis pass. Although the original tests are being removed, the tests themselves performed very little in terms of validation but merely tested that we did not assert when generating code. Update this with new tests that actually ensure that we do not regress on the code generation. llvm-svn: 263714	2016-03-17 14:10:49 +00:00
Simon Atanasyan	16f2460575	[llvm-objdump] Add REQUIRES x86 directive to fix buildbots llvm-svn: 263708	2016-03-17 11:09:21 +00:00
Simon Atanasyan	34223a7e5d	[llvm-objdump] Add '0x' prefix to a target displacement number to accent its hex format It might be hard to recognize a hexadecimal number without '0x' prefix. Besides that '0x' prefix corresponds to GNU objdump behaviour. Differential Revision: http://reviews.llvm.org/D18207 llvm-svn: 263705	2016-03-17 10:43:44 +00:00
Simon Atanasyan	58ee875296	[mips] Use `formatImm` call to print immediate value in the `MipsInstPrinter` That allows, for example, to print hex-formatted immediates using llvm-objdump --print-imm-hex command line option. Differential Revision: http://reviews.llvm.org/D18195 llvm-svn: 263704	2016-03-17 10:43:36 +00:00
David Majnemer	6f66f0a343	[yaml2obj, COFF] Correctly handle section alignment The section alignment field was marked optional but not provided a default value: initialize it with 0. While we are here, ensure that the section alignment is plausible. llvm-svn: 263692	2016-03-17 05:43:26 +00:00
Sanjay Patel	3b32ebb97b	use FileCheck for tighter checking llvm-svn: 263679	2016-03-16 23:39:37 +00:00
Sanjay Patel	b672e792f2	reduce check strings; no need to check IR comments llvm-svn: 263675	2016-03-16 23:22:01 +00:00
Sanjay Patel	355c77e796	use FileCheck for tighter checking llvm-svn: 263674	2016-03-16 23:20:20 +00:00
Chris Bieneman	671d0dda7d	Upgrade TBAA before upgrading intrinsics Summary: If TBAA is on an intrinsic and it gets upgraded and drops the TBAA we hit an odd assert. We should just upgrade the TBAA first because it doesn't have side-effects. Reviewers: reames, apilipenko, manmanren Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18229 llvm-svn: 263673	2016-03-16 23:17:54 +00:00
Sanjay Patel	6cec10572f	use FileCheck for tighter checking I'm testing out a script that auto-generates the check lines. It's 98% copied from utils/update_llc_test_checks.py. If others think this is useful, please let me know. llvm-svn: 263668	2016-03-16 22:34:57 +00:00
Sanjay Patel	cb775fcf22	use FileCheck for tighter checking I'm testing out a script that auto-generates the check lines. It's 98% copied from utils/update_llc_test_checks.py. If others think this is useful, please let me know. llvm-svn: 263667	2016-03-16 22:29:07 +00:00
Nicolai Haehnle	ef160de3e5	AMDGPU: Prevent uniform loops from becoming infinite Summary: Uniform loops where the branch leaving the loop is predicated on VCCNZ must be skipped if EXEC = 0, otherwise they will be infinite. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18137 llvm-svn: 263658	2016-03-16 20:14:33 +00:00
Geoff Berry	56fabf9b55	Revert "[LSR] Create fewer redundant instructions." This reverts commit r263644. Investigating bootstrap failures. llvm-svn: 263655	2016-03-16 19:21:47 +00:00
Simon Pilgrim	53f27fc00f	[X86] Reduced alignment of widened vector load/stores to better match PR26953 cases llvm-svn: 263649	2016-03-16 18:32:44 +00:00
Sanjay Patel	fd24fb1b2b	add checks for 'unpredictable' metadata preservation llvm-svn: 263648	2016-03-16 18:15:34 +00:00
Geoff Berry	459b750871	[LSR] Create fewer redundant instructions. Summary: Fix LSRInstance::HoistInsertPosition() to check the original insert position block first for a canonical insertion point that is dominated by all inputs. This leads to SCEV being able to reuse more instructions since it currently tracks the instructions it creates for reuse by keeping a table of <Value, insert point> pairs. Reviewers: atrick Subscribers: mcrosier, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18001 llvm-svn: 263644	2016-03-16 17:29:49 +00:00
Simon Pilgrim	60228bdb80	[X86] Regenerated + extended widened vector conversion tests - Ensure we test X86 + X64 - sitopfp / uitofp requires testing for SSE2 and SSE42 as well (part of the fix for PR26953) llvm-svn: 263640	2016-03-16 15:33:43 +00:00
Igor Breger	0ba7b04f5f	AVX512BW: Fix SRA v64i8 lowering. Use PCMPGTM (cmp result in k register) for 512bit vector because PCMPGT supported only for 128/256bit. Differential Revision: http://reviews.llvm.org/D18204 llvm-svn: 263624	2016-03-16 08:48:26 +00:00
Vedant Kumar	ddfec8cb55	[Bitcode] Add compatibility test for the 3.8 release Fork off compatibility.ll for the 3.8 release. The *.bc file in this commit was produced using a Release build of the release_38 branch. llvm-svn: 263620	2016-03-16 05:43:03 +00:00
Haicheng Wu	7873857a88	[JumpThreading] See through Cast Instructions To capture more jump-thread opportunity. llvm-svn: 263618	2016-03-16 04:52:52 +00:00
Simon Pilgrim	39d2411c3b	[X86] Regenerated widen load tests llvm-svn: 263608	2016-03-16 00:41:21 +00:00
Simon Pilgrim	f7cb16f7de	[X86][SSE41] Additional tests for extracting zeroable shuffle elements We can currently only match zeroable vector elements of the same size as the shuffle type - these tests demonstrate the problem and a solution will be shortly added in an updated D14261 llvm-svn: 263606	2016-03-16 00:13:36 +00:00
Haicheng Wu	64d9d7c3f7	Revert "[JumpThreading] Simplify Instructions first in ComputeValueKnownInPredecessors()" Not sure it handles undef properly. llvm-svn: 263605	2016-03-15 23:38:47 +00:00
Evgeniy Stepanov	d6e91369d8	[msan] Don't put module constructors in comdats. There is something strange going on with debug info (.eh_frame_hdr) disappearing when msan.module_ctor are placed in comdat sections. Moving this functionality under flag, disabled by default. llvm-svn: 263579	2016-03-15 20:25:47 +00:00
Quentin Colombet	fdc838e97f	[MIR] Add a test case for the diagnostic of a wrongly typed generic instruction llvm-svn: 263573	2016-03-15 18:31:29 +00:00
Quentin Colombet	a3f2abad55	[AArch64] Move GlobalISel test cases into a GlobalISel subdirectory llvm-svn: 263572	2016-03-15 18:30:00 +00:00
Adam Nemet	fdb20595a1	[LV] Preserve LoopInfo when store predication is used This was a latent bug that got exposed by the change to add LoopSimplify as a dependence to LoopLoadElimination. Since LoopInfo was corrupted after LV, LoopSimplify mis-compiled nbench in the test-suite (more details in the PR). The problem was that when we create the blocks for predicated stores we didn't add those to any loops. The original testcase for store predication provides coverage for this assuming we verify LI on the way out of LV. Fixes PR26952. llvm-svn: 263565	2016-03-15 18:06:20 +00:00
Changpeng Fang	01f6062227	AMDGPU/SI: Implement GroupStaticSize Intrinsic for Dynamic LDS Summary: Static LDS size is saved in MachineFunctionInfo::LDSSize, We define a pseudo instruction with usesCustomInserter bit set. Then, in EmitInstrWithCustomInserter, we replace this pseudo instruction with a mov of MachineFunctionInfo::LDSSize. Reviewers: arsenm tstellarAMD Subscribers llvm-commits, arsenm Differential Revision: http://reviews.llvm.org/D18064 llvm-svn: 263563	2016-03-15 17:28:44 +00:00
Hemant Kulkarni	c030f23b8b	[llvm-readobj] Impl GNU style printing of sections and relocations Differential Revision: http://reviews.llvm.org/D17523 llvm-svn: 263561	2016-03-15 17:25:31 +00:00
Benjamin Kramer	96f4b12880	[GlobalOpt] Don't look through aliases when sorting names of globals. If both are different aliases to the same value the sorting becomes non-deterministic as array_pod_sort is not stable. llvm-svn: 263550	2016-03-15 14:18:26 +00:00
Nikolay Haustov	cb9dddb1d7	[AMDGPU] Assembler: Update SOP* tests Add VI encodings. Reformat sopp.s to match style of other files. Differential Revision: http://reviews.llvm.org/D18084 llvm-svn: 263540	2016-03-15 07:44:57 +00:00
David Majnemer	0ab61bfb37	[llvm-objdump] Add support for dumping the PE TLS directory The PE TLS directory contains information about where the TLS data resides in the image, what functions should be executed when threads are created, etc. llvm-svn: 263537	2016-03-15 06:14:01 +00:00
Lang Hames	abda4d2526	[MachO] Extend the alt_entry support for aliases added in r263521 to expressions of the form 'a = .' and 'a = Ltmp'. llvm-svn: 263528	2016-03-15 04:20:49 +00:00
Lang Hames	1b640e05ba	[MachO] Add MachO alt-entry directive support. This patch adds support for the MachO .alt_entry assembly directive, and uses it for global aliases with non-zero GEP offsets. The alt_entry flag indicates that a symbol should be laid out immediately after the preceding symbol. Conceptually it introduces an alternate entry point for a function or data structure. E.g.: safe_foo: // check preconditions for foo .alt_entry fast_foo fast_foo: // body of foo, can assume preconditions. The .alt_entry flag is also implicitly set on assembly aliases of the form: a = b + C where C is a non-zero constant, since these have the same effect as an alt_entry symbol: they introduce a label that cannot be moved relative to the preceding one. Setting the alt_entry flag on aliases of this form fixes http://llvm.org/PR25381. llvm-svn: 263521	2016-03-15 01:43:05 +00:00
Teresa Johnson	26ab5772b0	[ThinLTO] Renaming of function index to module summary index (NFC) (Resubmitting after fixing missing file issue) With the changes in r263275, there are now more than just functions in the summary. Completed the renaming of data structures (started in r263275) to reflect the wider scope. In particular, changed the FunctionIndex* data structures to ModuleIndex*, and renamed related variables and comments. Also renamed the files to reflect the changes. A companion clang patch will immediately succeed this patch to reflect this renaming. llvm-svn: 263513	2016-03-15 00:04:37 +00:00
Eric Christopher	da8b3f1914	Temporarily Revert "[X86][SSE] Simplify vector LOAD + EXTEND on pre-SSE41 hardware" as it seems to be causing crashes during code generation in halide. PR forthcoming. This reverts commit r263303. llvm-svn: 263512	2016-03-14 23:59:57 +00:00
Justin Lebar	6827de19b2	[LoopUnroll] Respect the convergent attribute. Summary: Specifically, when we perform runtime loop unrolling of a loop that contains a convergent op, we can only unroll k times, where k divides the loop trip multiple. Without this change, we'll happily unroll e.g. the following loop for (int i = 0; i < N; ++i) { if (i == 0) convergent_op(); foo(); } into int i = 0; if (N % 2 == 1) { convergent_op(); foo(); ++i; } for (; i < N - 1; i += 2) { if (i == 0) convergent_op(); foo(); foo(); }. This is unsafe, because we've just added a control-flow dependency to the convergent op in the prelude. In general, runtime unrolling loops that contain convergent ops is safe only if we don't have to emit a prelude, which occurs when the unroll count divides the trip multiple. Reviewers: resistor Subscribers: llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D17526 llvm-svn: 263509	2016-03-14 23:15:34 +00:00
Amaury Sechet	bdb261b4c0	Improve load to store => memcpy Summary: This now tries to reorder instructions in order to help create the optimizable pattern. Reviewers: craig.topper, spatel, dexonsmith, Prazek, chandlerc, joker.eph, majnemer Differential Revision: http://reviews.llvm.org/D16523 llvm-svn: 263503	2016-03-14 22:52:27 +00:00
Teresa Johnson	cec0cae313	Revert "[ThinLTO] Renaming of function index to module summary index (NFC)" This reverts commit r263490. Missed a file. llvm-svn: 263493	2016-03-14 21:18:10 +00:00
Teresa Johnson	892920b358	[ThinLTO] Renaming of function index to module summary index (NFC) With the changes in r263275, there are now more than just functions in the summary. Completed the renaming of data structures (started in r263275) to reflect the wider scope. In particular, changed the FunctionIndex* data structures to ModuleIndex*, and renamed related variables and comments. Also renamed the files to reflect the changes. A companion clang patch will immediately succeed this patch to reflect this renaming. llvm-svn: 263490	2016-03-14 21:05:56 +00:00
Sanjay Patel	ee52b6e77d	allow branch weight metadata on select instructions (PR26636) As noted in: https://llvm.org/bugs/show_bug.cgi?id=26636 This doesn't accomplish anything on its own. It's the first step towards preserving and using branch weights with selects. The next step would be to make sure we're propagating the info in all of the other places where we create selects (SimplifyCFG, InstCombine, etc). I don't think there's an easy fix to make this happen; we have to look at each transform individually to determine how to correctly propagate the weights. Along with that step, we need to then use the weights when making subsequent transform decisions such as discussed in http://reviews.llvm.org/D16836. The inliner test is independent but closely related. It verifies that metadata is preserved when both branches and selects are cloned. Differential Revision: http://reviews.llvm.org/D18133 llvm-svn: 263482	2016-03-14 20:18:59 +00:00
Justin Lebar	9d94397859	[attrs] Handle convergent CallSites. Summary: Previously we had a notion of convergent functions but not of convergent calls. This is insufficient to correctly analyze calls where the target is unknown, e.g. indirect calls. Now a call is convergent if it targets a known-convergent function, or if it's explicitly marked as convergent. As usual, we can remove convergent where we can prove that no convergent operations are performed in the call. Originally landed as r261544, then reverted in r261544 for (incidental) build breakage. Re-landed here with no changes. Reviewers: chandlerc, jingyue Subscribers: llvm-commits, tra, jhen, hfinkel Differential Revision: http://reviews.llvm.org/D17739 llvm-svn: 263481	2016-03-14 20:18:54 +00:00
Michael Kuperstein	b7860fedd4	[AliasSetTracker] Do not strip pointer casts when processing MemSetInst This fixes PR26843. llvm-svn: 263462	2016-03-14 18:34:29 +00:00
Chad Rosier	27c352d26d	[AArch64] Refactor AArch64FrameLowering::emitPrologue. NFC. http://reviews.llvm.org/D18125 Patch by Aditya Kumar. llvm-svn: 263461	2016-03-14 18:24:34 +00:00
Chad Rosier	6d98655070	[AArch64] Break the dependency between FP and SP when possible. When the SP is not changed because of realignment/VLAs etc., we restore the SP by using the previous value of SP and not the FP. Breaking the dependency will help in cases when the epilog of a callee is close to the epilog of the caller; for then "sub sp, fp, #" depends on the load restoring the FP in the epilog of the callee. http://reviews.llvm.org/D18060 Patch by Aditya Kumar and Evandro Menezes. llvm-svn: 263458	2016-03-14 18:17:41 +00:00
Tom Stellard	331f981cc9	AMDGPU/SI: Handle wait states required for DPP instructions Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17543 llvm-svn: 263447	2016-03-14 17:05:56 +00:00
Sanjay Patel	62d707c8d9	[x86, AVX] replace masked load with full vector load when possible Converting masked vector loads to regular vector loads for x86 AVX should always be a win. I raised the legality issue of reading the extra memory bytes on llvm-dev. I did not see any objections. 1. x86 already does this kind of optimization for multiple scalar loads -> vector load. 2. If other targets have the same flexibility, we could move this transform up to CGP or DAGCombiner. Differential Revision: http://reviews.llvm.org/D18094 llvm-svn: 263446	2016-03-14 16:54:43 +00:00
Daniel Sanders	e8efff373a	[mips] MIPS32R6 compact branch support Summary: MIPSR6 introduces a class of branches called compact branches. Unlike the traditional MIPS branches which have a delay slot, compact branches do not have a delay slot. The instruction following the compact branch is only executed if the branch is not taken and must not be a branch. It works by generating compact branches for MIPS32R6 when the delay slot filler cannot fill a delay slot. Then, inspecting the generated code for forbidden slot hazards (a compact branch with an adjacent branch or other CTI) and inserting nops to clear this hazard. Patch by Simon Dardis. Reviewers: vkalintiris, dsanders Subscribers: MatzeB, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D16353 llvm-svn: 263444	2016-03-14 16:24:05 +00:00
Marek Olsak	ed2213e6ef	AMDGPU/SI: Incomplete shader binaries need to finish execution at the end Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D18058 llvm-svn: 263441	2016-03-14 15:57:14 +00:00
Nicolai Haehnle	74127fe8d7	AMDGPU: mark llvm.amdgcn.image.atomic.* as a source of divergence Summary: When multiple threads perform an atomic op with the same arguments, they will usually see different return values. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18101 llvm-svn: 263440	2016-03-14 15:37:18 +00:00
Benjamin Kramer	1082fa66a5	Revert "Recommitted r261633 "Supporting all entities declared in lexical scope in LLVM debug info." After fixing PR26715 at r263379." This reverts commit r263424. Breaks self-host. llvm-svn: 263437	2016-03-14 14:58:36 +00:00
Ulrich Weigand	cdce026b4d	[SystemZ] Avoid LER on z13 due to partial register dependencies On the z13, it turns out to be more efficient to access a full floating-point register than just the upper half (as done e.g. by the LE and LER instructions). Current code already takes this into account when loading from memory by using the LDE instruction in place of LE. However, we still generate LER, which shows the same performance issues as LE in certain circumstances. This patch changes the back-end to emit LDR instead of LER to implement FP32 register-to-register copies on z13. llvm-svn: 263431	2016-03-14 13:50:03 +00:00
Zlatko Buljan	fba68931ed	[mips] Fix an issue with long double when function roundl is defined Differential Revision: http://reviews.llvm.org/D17760 llvm-svn: 263428	2016-03-14 12:50:23 +00:00
Daniel Sanders	127d2d2b46	[mips] Range check uimm16_64 Summary: Reviewers: vkalintiris Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D17725 llvm-svn: 263427	2016-03-14 12:44:44 +00:00
Amjad Aboud	ab0378b16c	Recommitted r261633 "Supporting all entities declared in lexical scope in LLVM debug info." After fixing PR26715 at r263379. llvm-svn: 263424	2016-03-14 12:03:20 +00:00
Nikolay Haustov	79af6b33e0	[AMDGPU] Assembler: SOP* instruction fixes s_bitset0_b64, s_bitset1_b64 has 32-bit src0, not 64-bit. s_rfe_b64 has just one destination operand and no source. Uncomment S_BITCMP* and S_SETVSKIP, adjust SOPC_* classes for that. Add s_memrealtime test and change comments in smem.s to follow common style. Change test for s_memtime to use non-zero register to make it really test encoding. Add tests for s_buffer_load*. Add tests for SOPC instructions (same for SI and VI) Differential Revision: http://reviews.llvm.org/D18040 llvm-svn: 263420	2016-03-14 11:17:19 +00:00
Daniel Sanders	19b7f76afa	[mips] Range check uimm6_lsl2. Summary: Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D17291 llvm-svn: 263419	2016-03-14 11:16:56 +00:00
Igor Breger	a949100532	AVX512: icmp operation should be always lowered to CMPM (AVX-512) instruction on SKX. implemented by delena Differential Revision: http://reviews.llvm.org/D18054 llvm-svn: 263417	2016-03-14 10:26:39 +00:00
Haicheng Wu	d60ae33d29	[CVP] Convert an SDiv to a UDiv if both operands are known to be nonnegative The motivating example is this for (j = n; j > 1; j = i) { i = j / 2; } The signed division can safely be changed to an unsigned division (j is known to be larger than 1 from the loop guard) and later turned into a single shift without considering the sign bit. llvm-svn: 263406	2016-03-14 03:24:28 +00:00
Simon Pilgrim	9b7aaafc6a	[X86][XOP] Added target shuffle combine tests for XOP's VPPERM 2-op shuffle Actual combining support will be added in a future patch llvm-svn: 263402	2016-03-14 00:18:26 +00:00
Simon Pilgrim	5f1326f8cc	[X86][SSE] Added truncated vector arithmetic tests. For cases where we are truncating an integer vector arithmetic result, it may be better to pre-truncate the input operands - no code to support this yet (scalar is done with SimplifyDemandedBits but adding vector support could be a lot of work) but these tests represent the current codegen status. Example bugs: PR14666, PR22703 llvm-svn: 263384	2016-03-13 19:08:01 +00:00
Simon Pilgrim	035b19ecf5	[X86][SSE41] Avoid variable blend for constant v8i16 shifts The SSE41 v8i16 shift lowering using (v)pblendvb is great for non-constant shift amounts, but if it is constant then we can efficiently reduce the VSELECT to shuffles with the pre-SSE41 lowering. llvm-svn: 263383	2016-03-13 18:35:59 +00:00
Amaury Sechet	b325686764	Add echo test for constant data arrays in the LLVM C API llvm-svn: 263350	2016-03-13 00:58:25 +00:00
Sanjay Patel	610da4fbaf	update test to use FileCheck llvm-svn: 263347	2016-03-12 21:09:26 +00:00
Sanjay Patel	c4acbae63f	[x86, InstCombine] delete x86 SSE2 masked store with zero mask This follows up on the related AVX instruction transforms, but this one is too strange to do anything more with. Intel's behavioral description of this instruction in its Software Developer's Manual is tragi-comic. llvm-svn: 263340	2016-03-12 15:16:59 +00:00
Nemanja Ivanovic	bd56e4e25a	Fix for PR 26378 This patch corresponds to review: http://reviews.llvm.org/D17712 We were not clearing the TOC vector in PPCAsmPrinter when initializing it. This caused duplicate definition asserts when the pass is reused on the module (i.e. with -compile-twice or in JIT contexts). llvm-svn: 263338	2016-03-12 10:23:07 +00:00

35421 Commits