llvm-project

Commit Graph

Author	SHA1	Message	Date
Eric Christopher	7e70aba1a8	Recommit r231324 with a fix to the ARM execution domain code to disable lane switching if we don't actually have the instruction set we want to switch to. Models the earlier check above the conditional for the pass. The testcase is one that triggered with the assert that's added as part of the fix, use it to avoid adding a new testcase as it highlights the same problem. llvm-svn: 231539	2015-03-07 00:12:22 +00:00
Olivier Sallenave	049d803ce0	Do not restrict interleaved unrolling to small loops, depending on the target. llvm-svn: 231528	2015-03-06 23:12:04 +00:00
Quentin Colombet	66b616351c	[AArch64][LoadStoreOptimizer] Generate LDP + SXTW instead of LD[U]R + LD[U]RSW. Teach the load store optimizer how to sign extend a result of a load pair when it helps creating more pairs. The rationale is that loads are more expensive than sign extensions, so if we gather some in one instruction this is better! <rdar://problem/20072968> llvm-svn: 231527	2015-03-06 22:42:10 +00:00
Matthias Braun	898d11e864	DAGCombiner: Canonicalize select(and/or,x,y) depending on target. This is based on the following equivalences: select(C0 & C1, X, Y) <=> select(C0, select(C1, X, Y), Y) select(C0 | C1, X, Y) <=> select(C0, X, select(C1, X, Y)) Many targets cannot perform and/or on the CPU flags and therefore the right side should be chosen to avoid materializing the i1 flags in an integer register. If the target can perform this operation efficiently we normalize to the left form. Differential Revision: http://reviews.llvm.org/D7622 llvm-svn: 231507	2015-03-06 19:49:10 +00:00
Matthias Braun	3ecb557739	DAGCombiner: Factor out some and/or combines. This is in preparation for changing visitSELECT to normalize towards select(Cond0, select(Cond1, X, Y), Y); select(Cond0, X, select(Cond1, X, Y)) which perform an implicit and/or of the conditions. The factored function contains all DAGCombine rules which reduce two values combined by an And/Or operation to a single value. This does not include rules involving constants as visitSELECT already handles that case. Differential Revision: http://reviews.llvm.org/D8026 llvm-svn: 231506	2015-03-06 19:49:06 +00:00
Benjamin Kramer	e8a64a20f2	LoopInterchange: Remove empty method. llvm-svn: 231503	2015-03-06 19:37:26 +00:00
Benjamin Kramer	79442920bf	LoopInterchange: Rephrase instruction moving using ilist's splice and factor it into a function + Random cleanups. No functional change. llvm-svn: 231501	2015-03-06 18:59:14 +00:00
Matthias Braun	046318b87e	ExecutionDepsFix: Indizes -> Indices. Translate German to English. llvm-svn: 231500	2015-03-06 18:56:20 +00:00
Eric Christopher	6a8bfe7198	Fix typo. llvm-svn: 231495	2015-03-06 18:20:23 +00:00
Tom Stellard	6b42f2d8aa	R600/SI: Remove unused register class llvm-svn: 231491	2015-03-06 17:00:16 +00:00
Benjamin Kramer	298a3a0567	Fold init() helpers into constructors. NFC. llvm-svn: 231486	2015-03-06 16:21:15 +00:00
Chad Rosier	99b3e022c4	Avoid calls to dumpPassInfo and RegionBase<Tr>::getNameStr() in RGPassManager if -debug-pass is not specified, as the string is only used when dumping pass information. There is a big cost of determining the name in RegionBase<Tr>::getNameStr() if the region's entry or exit block doesn't have a name. This is the case for the Release build, as names are not preserved by the front-end. RegionPass is mainly used by Polly, resulting in long compile time for one file of a customer application with the Release build (1m24s) vs Release+Asserts build (10s) when Polly is used. With this change, the compile time with the Release build went down to 8s. Patch by Sanjin Sijaric <ssijaric@codeaurora.org>! Phabricator: http://reviews.llvm.org/D8076 llvm-svn: 231485	2015-03-06 16:15:04 +00:00
James Molloy	dcc78ec386	[ConstantRange] Teach multiply to be cleverer about signed ranges. Multiplication is not dependent on signedness, so just treating all input ranges as unsigned is not incorrect. However it will cause overly pessimistic ranges (such as full-set) when used with signed negative values. Teach multiply to try to interpret its inputs as both signed and unsigned, and then to take the most specific (smallest population) as its result. llvm-svn: 231483	2015-03-06 15:50:47 +00:00
Bruno Cardoso Lopes	618c67a018	[AsmPrinter][TLOF] 32-bit MachO support for replacing GOT equivalents Add MachO 32-bit (i.e. arm and x86) support for replacing global GOT equivalent symbol accesses. Unlike 64-bit targets, there's no GOTPCREL relocation, and access through a non_lazy_symbol_pointers section is used instead. -- before _extgotequiv: .long _extfoo _delta: .long _extgotequiv-_delta -- after _delta: .long L_extfoo$non_lazy_ptr-_delta .section __IMPORT,__pointers,non_lazy_symbol_pointers L_extfoo$non_lazy_ptr: .indirect_symbol _extfoo .long 0 llvm-svn: 231475	2015-03-06 13:49:05 +00:00
Bruno Cardoso Lopes	52b1391df6	[AsmPrinter][TLOF] ARM64 MachO support for replacing GOT equivalents Follow up r230264 and add ARM64 support for replacing global GOT equivalent symbol accesses by references to the GOT entry for the final symbol instead, example: -- before .globl _foo _foo: .long 42 .globl _gotequivalent _gotequivalent: .quad _foo .globl _delta _delta: .long _gotequivalent-_delta -- after .globl _foo _foo: .long 42 .globl _delta Ltmp3: .long _foo@GOT-Ltmp3 llvm-svn: 231474	2015-03-06 13:48:45 +00:00
Toma Tabacu	4e0cf8e211	[mips] [IAS] Add missing constraints and improve testing for the .module directive. Summary: None of the .set directives can be used before the .module directives. The .set mips0/pop/push were not triggering this constraint. Also added testing for all the other implemented directives which are supposed to trigger this constraint. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7140 llvm-svn: 231465	2015-03-06 12:15:12 +00:00
Daniel Jasper	6adbd7aecf	Change the way in which error case is being handled. Specifically this: * Prevents an "unused" warning in non-assert builds. * In that error case return without removing a child loop instead of looping forever. llvm-svn: 231459	2015-03-06 10:39:14 +00:00
Karthik Bhat	88db86dd29	Add a new pass "Loop Interchange" This pass interchanges loops to provide a more cache-friendly memory access. For e.g. given a loop like - for(int i=0;i<N;i++) for(int j=0;j<N;j++) A[j][i] = A[j][i]+B[j][i]; is interchanged to - for(int j=0;j<N;j++) for(int i=0;i<N;i++) A[j][i] = A[j][i]+B[j][i]; This pass is currently disabled by default. To give a brief introduction it consists of 3 stages- LoopInterchangeLegality : Checks the legality of loop interchange based on Dependency matrix. LoopInterchangeProfitability: A very basic heuristic has been added to check for profitability. This will evolve over time. LoopInterchangeTransform : Which does the actual transform. LNT Performance tests show improvement in Polybench/linear-algebra/kernels/mvt and Polybench/linear-algebra/kernels/gemver benchmarks. TODO: 1) Add support for reductions and lcssa phi. 2) Improve profitability model. 3) Improve loop selection algorithm to select best loop for interchange. Currently the innermost loop is selected for interchange. 4) Improve compile time regression found in llvm lnt due to this pass. 5) Fix issues in Dependency Analysis module. A special thanks to Hal for reviewing this code. Review: http://reviews.llvm.org/D7499 llvm-svn: 231458	2015-03-06 10:11:25 +00:00
David Majnemer	b61f4e403d	X86: Form IMGREL relocations for LLVM Functions We supported forming IMGREL relocations from ConstantExprs involving __ImageBase if the minuend was a GlobalVariable. Extend this functionality to all GlobalObjects. llvm-svn: 231456	2015-03-06 08:11:32 +00:00
Yaron Keren	322bdad085	Silence C4715 'not all control paths return a value' warnings. llvm-svn: 231455	2015-03-06 07:49:14 +00:00
Rui Ueyama	da9bc2e56d	Support: Improve performance of FileOutputBuffer on Windows We extend an underlying file before mmap'ing it, but it's not needed on Windows. Extending a file is slow on Windows, so we should avoid doing that. The difference gets larger as the size of an output file gets larger. It shaved 2 seconds off 25 seconds when linking chrome.dll with LLD, for example. llvm-svn: 231452	2015-03-06 06:07:32 +00:00
Michael Gottesman	6ff10c959a	[objc-arc] Sprinkle some more auto on some iterators. llvm-svn: 231447	2015-03-06 02:10:03 +00:00
Michael Gottesman	16e6a2057f	[objc-arc] Move the detection of potential uses or altering of a ref count onto PtrState. llvm-svn: 231446	2015-03-06 02:07:12 +00:00
Michael Zolotukhin	03dd1082ad	LegalizeTypes: Handle shift by 0 in ExpandShiftByConstant. Though such shifts are usually optimized away by combiner, we still can encounter them after a vector shift is legalized. llvm-svn: 231443	2015-03-06 01:13:01 +00:00
Rafael Espindola	a5b9e1cf39	Remember to move a type to the correct set when setting the body. We would set the body of a struct type (therefore making it non-opaque) but were forgetting to move it to the non-opaque set. Fixes pr22807. llvm-svn: 231442	2015-03-06 00:50:21 +00:00
Michael Gottesman	6080596328	[objc-arc] Move the checking of whether or not we can match onto PtrStates and out of the main dataflow. These refactored computations check whether or not we are at a stage of the sequence where we can perform a match. This patch moves the computation out of the main dataflow and into {BottomUp,TopDown}PtrState. llvm-svn: 231439	2015-03-06 00:34:42 +00:00
Michael Gottesman	4eae396ae9	[objc-arc] Refactor (Re-)initialization of PtrState from dataflow -> {TopDown,BottomUp}PtrState Class. This initialization occurs when we see a new retain or release. Before we performed the actual initialization inline in the dataflow. That is just messy. llvm-svn: 231438	2015-03-06 00:34:39 +00:00
Michael Gottesman	feb138e211	[objc-arc] Create two subclasses of PtrState in preparation for moving per ptr state change behavior onto a PtrState class. This will enable the main ObjCARCOpts dataflow to work with higher level concepts such as "can this ptr state be modified by this ref count" and not need to understand the nitty gritty details of how that is determined. This makes the dataflow cleaner. llvm-svn: 231437	2015-03-06 00:34:36 +00:00
Michael Gottesman	41c01005ed	[objc-arc] Extract out MDNodes into a cache structure so the information can be passed around. llvm-svn: 231436	2015-03-06 00:34:33 +00:00
Michael Gottesman	f6bcb81000	[objc-arc] Remove annotations code. It will always be in the history if it is needed again. Now it is just dead code. llvm-svn: 231435	2015-03-06 00:34:29 +00:00
Nadav Rotem	c99a38796c	Teach ComputeNumSignBits about signed remainder. This optimization is a continuation of r231140 that reasoned about signed div. llvm-svn: 231433	2015-03-06 00:23:58 +00:00
Michael Gottesman	d45907bd38	Fix build error. llvm-svn: 231430	2015-03-05 23:57:07 +00:00
Michael Gottesman	a9fc016281	[objc-arc] Change some casts and loop iterators to use auto. llvm-svn: 231427	2015-03-05 23:29:06 +00:00
Michael Gottesman	68b91dbf84	[objc-arc] Extract out state specific to a ref count from the main objc arc sequence dataflow. This will allow me to separate the actual ARC queries from the meat of the dataflow algorithm. llvm-svn: 231426	2015-03-05 23:29:03 +00:00
Michael Gottesman	0be6920e23	[objc-arc] Extract blot map vector into its own file. NFC. llvm-svn: 231425	2015-03-05 23:28:58 +00:00
Ahmed Bougacha	c6dcf7a7cc	[X86] Remove stale comment. NFC. It turns out 256bit V[SZ]EXT nodes are still generated by the new shuffle lowering, so this is here to stay! llvm-svn: 231422	2015-03-05 23:18:41 +00:00
Benjamin Kramer	fc165f1434	Instructions: Use delegated constructors to reduce duplication NFC. llvm-svn: 231411	2015-03-05 22:05:26 +00:00
Sanjay Patel	302404b277	[AVX] Lower / fast-isel scalar FP selects into VBLENDV instructions (PR22483) This patch reduces code size for all AVX targets and increases speed for some chips. SSE 4.1 introduced the useless (see code comments) 2-register form of BLENDV and only in the packed float/double flavors. AVX subsequently made the instruction useful by adding a 4-register operand form. So we just need to paper over the lack of scalar forms of this instruction, complicate the code to choose float or double forms, and use blendv on scalars since all FP is in xmm registers anyway. This gives us an approximately 50% speed up for a blendv microbenchmark sequence on SandyBridge and Haswell: blendv : 29.73 cycles/iter logic : 43.15 cycles/iter No new test cases with this patch because: 1. fast-isel-select-sse.ll tests the positive side for regular X86 lowering and fast-isel 2. sse-minmax.ll and fp-select-cmp-and.ll confirm that we're not firing for scalar selects without AVX 3. fp-select-cmp-and.ll and logical-load-fold.ll confirm that we're not firing for scalar selects with constants. http://llvm.org/bugs/show_bug.cgi?id=22483 Differential Revision: http://reviews.llvm.org/D8063 llvm-svn: 231408	2015-03-05 21:46:54 +00:00
Benjamin Kramer	fb0abceb5c	SelectionDAGBuilder: Merge 3 copies of the limited precision exp2 emission code. NFC intended. llvm-svn: 231406	2015-03-05 21:13:08 +00:00
Andrew Kaylor	05ee8bd4e3	Fix uninitialized memory references in WinEHPrepare llvm-svn: 231405	2015-03-05 21:06:42 +00:00
Benjamin Kramer	c54c38e090	SDAG: Merge the meat of two ExpandAtomic implementations. The copies already diverged, don't let them become any worse. Reduce redundancy in code with a little macro metaprogramming. llvm-svn: 231401	2015-03-05 20:04:29 +00:00
Ahmed Bougacha	1b67630cb3	[AArch64] Teach AsmPrinter about GlobalAddress operands. Fixes PR22761, rdar://20024866. Differential Revision: http://reviews.llvm.org/D8042 llvm-svn: 231400	2015-03-05 20:04:21 +00:00
Rafael Espindola	092b619e55	Use the correct func begin symbol in all places in ppc. I missed an occurrence of the old symbol in my previous patch. llvm-svn: 231398	2015-03-05 19:47:50 +00:00
Ahmed Bougacha	4200cc95b4	[ARM] Enable vector extload combine for legal types. This commit enables forming vector extloads for ARM. It only does so for legal types, and when we can't fold the extension in a wide/long form of the user instruction. Enabling it for larger types isn't as good an idea on ARM as it is on X86, because: - we pretend that extloads are legal, but end up generating vld+vmov - we have instructions like vld {dN, dM}, which can't be generated when we "manually expand" extloads to vld+vmov. For legal types, the combine doesn't fire that often: in the integration tests only in a big endian testcase, where it removes a pointless AND. Related to rdar://19723053 Differential Revision: http://reviews.llvm.org/D7423 llvm-svn: 231396	2015-03-05 19:37:53 +00:00
Zachary Turner	cd132c9b0d	Replace PrintStackTrace(FILE*) with PrintStackTrace(raw_ostream&) This will be followed by a change on the clang side to update the only user of this function with the new version. Differential Revision: http://reviews.llvm.org/D8074 Reviewed By: Reid Kleckner llvm-svn: 231392	2015-03-05 19:10:52 +00:00
Reid Kleckner	286b100750	Remove accidental errs() call in Verifier llvm-svn: 231391	2015-03-05 19:05:25 +00:00
Rafael Espindola	86bd6a1202	Use the generic Lfunc_begin label on ppc. This removes yet another custom label to mark the start of a function. llvm-svn: 231390	2015-03-05 18:55:50 +00:00
David Majnemer	71b9b6be1b	X86: Optimize address mode matching for FRAME_ALLOC_RECOVER nodes We know that the absolute symbol will be less than 2GB and thus will always fit. llvm-svn: 231389	2015-03-05 18:50:12 +00:00
Reid Kleckner	e658058cc0	Silence -Wmissing-braces warning from clang-cl The first element of STACKFRAME64 is a struct and Clang wants us to put braces around its initialization. Instead, drop the zero. The result should be the same. llvm-svn: 231387	2015-03-05 18:26:58 +00:00
Reid Kleckner	cfb9ce53c1	Replace llvm.frameallocate with llvm.frameescape Turns out it's pretty straightforward and simplifies the implementation. Reviewers: andrew.w.kaylor Differential Revision: http://reviews.llvm.org/D8051 llvm-svn: 231386	2015-03-05 18:26:34 +00:00
Erik Eckstein	8c76e669c5	Revert r231276 (including r231277): Add a lock() function in PassRegistry to speed up multi-thread synchronization. llvm-svn: 231385	2015-03-05 17:53:00 +00:00
Zachary Turner	62b7b617a8	[Windows] Implement PrintStackTrace(FILE) llvm::sys::PrintBacktrace(FILE) is supposed to print a backtrace of the current thread given the current PC. This function was unimplemented on Windows, and instead the only time we could print a backtrace was as the result of an exception through LLVMUnhandledExceptionFilter. This patch implements backtracing of self by using RtlCaptureContext to get a CONTEXT for the current thread, and moving the printing and StackWalk64 code to a common method that printing own stack trace and printing stack trace of an exception can use. Differential Revision: http://reviews.llvm.org/D8068 Reviewed by: Reid Kleckner llvm-svn: 231382	2015-03-05 17:47:52 +00:00
Simon Pilgrim	7189084bef	[DagCombiner] Allow shuffles to merge through bitcasts Currently shuffles may only be combined if they are of the same type, despite the fact that bitcasts are often introduced in between shuffle nodes (e.g. x86 shuffle type widening). This patch allows a single input shuffle to peek through bitcasts and if the input is another shuffle will merge them, shuffling using the smallest sized type, and re-applying the bitcasts at the inputs and output instead. Dropped old ShuffleToZext test - this patch removes the use of the zext and vector-zext.ll covers these anyhow. Differential Revision: http://reviews.llvm.org/D7939 llvm-svn: 231380	2015-03-05 17:14:04 +00:00
Kit Barton	e48b1e1c4f	While reviewing the changes to Clang to add builtin support for the vsld, vsrd, and vsrad instructions, it was pointed out that the builtins are generating the LLVM opcodes (shl, lshr, and ashr) not calls to the intrinsics. This patch changes the implementation of the vsld, vsrd, and vsrad instructions from intrinsics to VXForm_1 instructions and makes them legal with P8 Altivec. It also removes the definition of the int_ppc_altivec_vsld, int_ppc_altivec_vsrd, and int_ppc_altivec_vsrad intrinsics. llvm-svn: 231378	2015-03-05 16:24:38 +00:00
Igor Laevsky	8d0851f509	Revert change r231366 as it broke clang-native-arm-cortex-a9 Analysis/properties.m test. llvm-svn: 231374	2015-03-05 15:41:14 +00:00
Elena Demikhovsky	de05f10de2	AVX-512, SKX: Enabled masked_load/store operations for this target. Added lowering for ISD::CONCAT_VECTORS and ISD::INSERT_SUBVECTOR for i1 vectors, it is needed to pass all masked_memop.ll tests for SKX. llvm-svn: 231371	2015-03-05 15:11:35 +00:00
Igor Laevsky	1725997f14	Teach lowering to correctly handle invoke statepoint and gc results tied to them. Note that we still can not lower gc.relocates for invoke statepoints. Also it extracts getCopyFromRegs helper function in SelectionDAGBuilder as we need to be able to customize type of the register exported from basic block during lowering of the gc.result. llvm-svn: 231366	2015-03-05 14:11:21 +00:00
Arnaud A. de Grandmaison	d8ed0d372c	[PBQP] Use a local bit-matrix to speedup searching an edge in the graph. Build time (user time) for building llvm+clang+lldb in release mode: - default allocator: 9086 seconds - with PBQP: 9126 seconds - with PBQP + local bit matrix cache: 9097 seconds llvm-svn: 231360	2015-03-05 09:12:59 +00:00
Michael Kuperstein	bcb26d6880	[InstCombine] Fix an assertion when fmul has a ConstantExpr operand isNormalFp and isFiniteNonZeroFp should not assume vector operands can not be constant expressions. Patch by Pawel Jurek <pawel.jurek@intel.com> Differential Revision: http://reviews.llvm.org/D8053 llvm-svn: 231359	2015-03-05 08:38:57 +00:00
Craig Topper	0ee8470a43	[X86] Use vmovss to handle inserting an element into index 0 of a v8f32 vector of zeros. llvm-svn: 231354	2015-03-05 06:38:42 +00:00
Frederic Riss	6e56345dbc	Remove useless break after return. Pointed out by Paul Robinson. llvm-svn: 231353	2015-03-05 06:13:39 +00:00
Hans Wennborg	6d8e6d5ee4	Revert r231324 "Remove the conditional addition of the execution dependency fixing" See PR22799. llvm-svn: 231348	2015-03-05 03:24:49 +00:00
Chandler Carruth	7a715dae05	[MBP] Use range based for-loops throughout this code. Several had already been added and the inconsistency made choosing names and changing code more annoying. Plus, wow are they better for this code! llvm-svn: 231347	2015-03-05 03:19:05 +00:00
Chandler Carruth	2fc3fe1282	[MBP] NFC, run clang-format over this code and tweak things to make the result reasonable. This code predated clang-format and so there was a reasonable amount of crufty formatting that had accumulated. This should ensure that neither myself nor others end up with formatting-only changes sneaking into other fixes. llvm-svn: 231341	2015-03-05 02:35:31 +00:00
Chandler Carruth	d0dced58ab	[MBP] This is no longer 'block-placement2'. ;] The old variants are long gone, update this code to reflect that. llvm-svn: 231340	2015-03-05 02:28:25 +00:00
Rafael Espindola	07c03d316d	Use the existing begin and end symbol for debug info. llvm-svn: 231338	2015-03-05 02:05:42 +00:00
NAKAMURA Takumi	478559a532	Reformat. llvm-svn: 231336	2015-03-05 01:25:19 +00:00
NAKAMURA Takumi	d8422ce0ec	Revert r231103, "FullDependenceAnalysis: Avoid using the (deprecated in C++11) copy ctor" It is miscompiled on msc18. llvm-svn: 231335	2015-03-05 01:25:12 +00:00
NAKAMURA Takumi	e110d641a0	Revert r231104, "unique_ptrify FullDependenceAnalysis::DV", to appease msc18 C2280. llvm-svn: 231334	2015-03-05 01:25:06 +00:00
Kostya Serebryany	83ce8779d5	[sanitizer] add nosanitize metadata to more coverage instrumentation instructions llvm-svn: 231333	2015-03-05 01:20:05 +00:00
Chandler Carruth	af7e99f2f4	[MBP] Revert r231238 which attempted to fix a nasty bug where MBP is just arbitrarily interleaving unrelated control flows once they get moved "out-of-line" (both outside of natural CFG ordering and with diamonds that cannot be fully laid out by chaining fallthrough edges). This easy solution doesn't work in practice, and it isn't just a small bug. It looks like a very different strategy will be required. I'm working on that now, and it'll again go behind some flag so that everyone can experiment and make sure it is working well for them. llvm-svn: 231332	2015-03-05 01:07:03 +00:00
NAKAMURA Takumi	8f49dd3687	ScalarEvolution.cpp: Appease g++-4.7. He missed implicit "this" in lambda. llvm-svn: 231331	2015-03-05 01:02:45 +00:00
Eric Christopher	385f4b36d8	Remove the conditional addition of the execution dependency fixing pass from the ARM backend as the pass itself will detect any use of the appropriate register class. llvm-svn: 231324	2015-03-05 00:28:55 +00:00
Eric Christopher	63b44882ef	Cleanup and remove a chunk of getARMSubtarget calls in the ARM TargetMachine pass pipeline construction by pushing them down into the appropriate pass. llvm-svn: 231323	2015-03-05 00:23:40 +00:00
Paul Robinson	49e38965dc	Turn off .debug_pubnames/pubtypes for PS4. Differential Revision: http://reviews.llvm.org/D8067 llvm-svn: 231322	2015-03-05 00:08:27 +00:00
Argyrios Kyrtzidis	dc8f979b41	[Support] Increase timeout for the LockFileManager back to 5 mins. Waiting for just 1 min may not be enough for some contexts. llvm-svn: 231309	2015-03-04 22:54:38 +00:00
Sanjoy Das	a5397c0198	[IndVarSimplify] use the "canonical" way to infer no-wrap. Summary: rL225282 introduced an ad-hoc way to promote some additions to nuw or nsw. Since then SCEV has become smarter in directly proving no-wrap; and using the canonical "ext(A op B) == ext(A) op ext(B)" method of proving no-wrap is just as powerful now. Rip out the existing complexity in favor of getting SCEV to do all the heavy lifting internally. This change does not add any unit tests because it is supposed to be a non-functional change. Tests added in rL225282 and rL226075 are valid tests for this change. Reviewers: atrick, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7981 llvm-svn: 231306	2015-03-04 22:24:23 +00:00
Sanjoy Das	9e2c5010f6	[SCEV] make SCEV smarter about proving no-wrap. Summary: Teach SCEV to prove no overflow for an add recurrence by proving something about the range of another add recurrence a loop-invariant distance away from it. Reviewers: atrick, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7980 llvm-svn: 231305	2015-03-04 22:24:17 +00:00
Frederic Riss	77f850e336	DWARFFormValue: Add getAsSignedConstant method. The implementation accepts explicitly signed forms (DW_FORM_sdata), but also unsigned forms as long as they fit in an int64_t. llvm-svn: 231299	2015-03-04 22:07:41 +00:00
Frederic Riss	ee17fb9b0e	Teach DIEInteger to emit FORM_strp and FORM_ref_addr attributes. To be used/tested by llvm-dsymutil. (llvm-dsymutil does a 'static' link, no need for relocations for most things, so it'll just emit raw integers for most attributes) llvm-svn: 231298	2015-03-04 22:07:36 +00:00
Rafael Espindola	266b8c8043	Expand variables when evaluating absolute expressions. This allows for variables to be used in .size. This matches gnu AS functionality. llvm-svn: 231295	2015-03-04 22:03:21 +00:00
Paul Robinson	78cc0821f0	Support standard DWARF TLS opcode; Darwin and PS4 use it. Differential Revision: http://reviews.llvm.org/D8018 llvm-svn: 231286	2015-03-04 20:55:11 +00:00
Nemanja Ivanovic	e8effe1edb	Add LLVM support for PPC cryptography builtins Review: http://reviews.llvm.org/D7955 llvm-svn: 231285	2015-03-04 20:44:33 +00:00
Reid Kleckner	4276945161	Try to satisfy sanitizer lint check llvm-svn: 231284	2015-03-04 20:38:59 +00:00
Erik Eckstein	8c38b8b873	Add a lock() function in PassRegistry to speed up multi-thread synchronization. When calling lock() after all passes are registered, the PassRegistry doesn't need a mutex anymore to look up passes. This speeds up multithreaded llvm execution by ~5% (tested with 4 threads). In an asserts build of llvm this has an even bigger impact. Note that it's not required to use the lock function. llvm-svn: 231276	2015-03-04 18:57:11 +00:00
Mehdi Amini	46a43556db	Make DataLayout Non-Optional in the Module Summary: DataLayout keeps the string used for its creation. As a side effect it is no longer needed in the Module. This is "almost" NFC, the string is no longer canonicalized, you can't rely on two "equals" DataLayout having the same string returned by getStringRepresentation(). Get rid of DataLayoutPass: the DataLayout is in the Module The DataLayout is "per-module", let's enforce this by not duplicating it more than necessary. One more step toward non-optionality of the DataLayout in the module. Make DataLayout Non-Optional in the Module Module->getDataLayout() will never return nullptr anymore. Reviewers: echristo Subscribers: resistor, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D7992 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 231270	2015-03-04 18:43:29 +00:00
Reid Kleckner	2ae03e1783	Revert "unique_ptrify ValID::ConstantStructElts" This reverts r231200 and r231204. The second one added an explicit move ctor for MSVC. This change broke the clang-cl self-host due to weirdness in MSVC's implementation of std::map::insert. Somehow we lost our rvalue ref-ness when going through variadic placement new: template <class _Objty, class... _Types> void construct(_Objty *_Ptr, _Types &&... _Args) { // construct _Objty(_Types...) at _Ptr ::new ((void *)_Ptr) _Objty(_STD forward<_Types>(_Args)...); } For some reason, Clang decided to call the deleted std::pair copy constructor at this point. Needs further investigation, once I can build. llvm-svn: 231269	2015-03-04 18:31:10 +00:00
Wei Mi	4d9347993b	Revert the test commit. llvm-svn: 231264	2015-03-04 17:44:22 +00:00
Wei Mi	20401eecd6	Test commit. It will be reverted in the next commit. llvm-svn: 231262	2015-03-04 17:41:17 +00:00
Adrian Prantl	0f61579602	Fix DwarfExpression::AddMachineRegExpression so it doesn't read past the end of an expression that ends with DW_OP_plus. Caught by the ASAN build bots. llvm-svn: 231260	2015-03-04 17:39:33 +00:00
Marek Olsak	d2af89df10	R600/SI: Add an intrinsic for S_FLBIT_I32 / V_FFBH_I32 Required by OpenGL (ARB_gpu_shader5). llvm-svn: 231259	2015-03-04 17:33:45 +00:00
Nemanja Ivanovic	d384cd9907	Test commit. Removed an unnecessary space llvm-svn: 231257	2015-03-04 17:09:12 +00:00
JF Bastien	f14889ee34	Mutate TargetLowering::shouldExpandAtomicRMWInIR to specifically dictate how AtomicRMWInsts are expanded. Summary: In PNaCl, most atomic instructions have their own @llvm.nacl.atomic.* function, each one, with a few exceptions, represents a consistent behaviour across all NaCl-supported targets. Unfortunately, the atomic RMW operations nand, [u]min, and [u]max aren't directly represented by any such @llvm.nacl.atomic.* function. This patch refines shouldExpandAtomicRMWInIR in TargetLowering so that a future `Le32TargetLowering` class can selectively inform the caller how the target desires the atomic RMW instruction to be expanded (ie via load-linked/store-conditional for ARM/AArch64, via cmpxchg for X86/others?, or not at all for Mips) if at all. This does not represent a behavioural change and as such no tests were added. Patch by: Richard Diamond. Reviewers: jfb Reviewed By: jfb Subscribers: jfb, aemerson, t.p.northover, llvm-commits Differential Revision: http://reviews.llvm.org/D7713 llvm-svn: 231250	2015-03-04 15:47:57 +00:00
Jozef Kolek	c925808ee5	[mips][microMIPS] Make usage of ADDU16 and SUBU16 by code generator Differential Revision: http://reviews.llvm.org/D7609 llvm-svn: 231249	2015-03-04 15:47:42 +00:00
Bill Schmidt	d90aff2c4f	[PowerPC] Remove unnecessary and incomplete commentary This "itinerary class map" in PPCSchedule.td is incomplete and redundant with the actual code. As it provides no value, we've decided to remove it. No functional change. llvm-svn: 231246	2015-03-04 14:56:05 +00:00
Andrea Di Biagio	df93ccf49a	[X86][FastISel] Simplify the logic in method X86SelectSIToFP. The target-independent selection algorithm in FastISel already knows how to select a SINT_TO_FP if the target is SSE but not AVX. On targets that have SSE but not AVX, the tablegen'd 'fastEmit' functions for ISD::SINT_TO_FP know how to select instruction X86::CVTSI2SSrr (for an i32 to f32 conversion) and X86::CVTSI2SDrr (for an i32 to f64 conversion). This patch simplifies the logic in method X86SelectSIToFP knowing that the code would not be reachable if the subtarget doesn't have AVX. No functional change intended. llvm-svn: 231243	2015-03-04 14:23:25 +00:00
Dmitry Vyukov	b37b95ed3e	asan: do not instrument direct inbounds accesses to stack variables Do not instrument direct accesses to stack variables that can be proven to be inbounds, e.g. accesses to fields of structs on stack. This eliminates 33% of instrumentation on webrtc/modules_unittests (number of memory accesses goes down from 290152 to 193998), reduces binary size by 15% (from 74M to 64M) and improves compilation time by 6-12%. The optimization is guarded by asan-opt-stack flag that is off by default. http://reviews.llvm.org/D7583 llvm-svn: 231241	2015-03-04 13:27:53 +00:00
Toma Tabacu	e1e3ffe71d	[mips] Rename the LA/LI/DLI TableGen definitions and classes. NFC. Summary: Use more reasonable names for these pseudo-instructions. As there's only one definition tied to any one of these classes, I named them with abbreviated versions of their respective class' name. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7831 llvm-svn: 231240	2015-03-04 13:01:14 +00:00
Vasileios Kalintiris	8761490d2e	[mips] Keep the parameter list of Filler::searchRange() consistent. NFC. Summary: Move the "Filler" parameter to the end of the parameter list as it is, conceptually, the only output parameter of that function. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7726 llvm-svn: 231239	2015-03-04 12:37:58 +00:00
Chandler Carruth	9a53fbe243	[MBP] Fix a really horrible bug in MachineBlockPlacement, but behind a flag for now. First off, thanks to Daniel Jasper for really pointing out the issue here. It's been here forever (at least, I think it was there when I first wrote this code) without getting really noticed or fixed. The key problem is what happens when two reasonably common patterns happen at the same time: we outline multiple cold regions of code, and those regions in turn have diamonds or other CFGs for which we can't just topologically lay them out. Consider some C code that looks like: if (a1()) { if (b1()) c1(); else d1(); f1(); } if (a2()) { if (b2()) c2(); else d2(); f2(); } done(); Now consider the case where a1() and a2() are unlikely to be true. In that case, we might lay out the first part of the function like: a1, a2, done; And then we will be out of successors in which to build the chain. We go to find the best block to continue the chain with, which is perfectly reasonable here, and find "b1" let's say. Laying out successors gets us to: a1, a2, done; b1, c1; At this point, we will refuse to lay out the successor to c1 (f1) because there are still un-placed predecessors of f1 and we want to try to preserve the CFG structure. So we go get the next best block, d1. ... wait for it ... Except that the next best block isn't d1. It is b2! d1 is waaay down inside these conditionals. It is much less important than b2. Except that this is exactly what we didn't want. If we keep going we get the entire set of the rest of the CFG interleaved!!! a1, a2, done; b1, c1; b2, c2; d1, f1; d2, f2; So we clearly need a better strategy here. =] My current favorite strategy is to actually try to place the block whose predecessor is closest. This very simply ensures that we unwind these kinds of CFGs the way that is natural and fitting, and should minimize the number of cache lines instructions are spread across. It also happens to be dead simple. 
It's like the datastructure was specifically set up for this use case or something. We only push blocks onto the work list when the last predecessor for them is placed into the chain. So the back of the worklist is the nearest next block. Unfortunately, a change like this is going to cause soooo many benchmarks to swing wildly. So for now I'm adding this under a flag so that we and others can validate that this is fixing the problems described, that it seems possible to enable, and hopefully that it fixes more of our problems long term. llvm-svn: 231238	2015-03-04 12:18:08 +00:00
Vasileios Kalintiris	2ef2888273	[mips] Specify the correct value type when combining a CMovFP node. This commit fixes a bug introduced in r230956 where we were creating CMovFP_{T,F} nodes with multiple return value types (one for each operand). With this change the return value type of the new node is the same as the value type of the True/False operands of the original node. llvm-svn: 231237	2015-03-04 12:10:18 +00:00
Daniel Jasper	471e856f49	Add a flag to experiment with outlining optional branches. In a CFG with the edges A->B->C and A->C, B is an optional branch. LLVM's default behavior is to lay the blocks out naturally, i.e. A, B, C, in order to improve code locality and fallthroughs. However, if a function contains many of those optional branches only a few of which are taken, this leads to a lot of unnecessary icache misses. Moving B out of line can work around this. Review: http://reviews.llvm.org/D7719 llvm-svn: 231230	2015-03-04 11:05:34 +00:00
Kristof Beyls	aea8461820	Fix PR22408 - LLVM producing AArch64 TLS relocations that GNU linkers cannot handle yet. As is described at http://llvm.org/bugs/show_bug.cgi?id=22408, the GNU linkers ld.bfd and ld.gold currently only support a subset of the whole range of AArch64 ELF TLS relocations. Furthermore, they assume that some of the code sequences to access thread-local variables are produced in a very specific sequence. When the sequence is not as the linker expects, it can silently mis-relax/mis-optimize the instructions. Even if that wouldn't be the case, it's good to produce the exact sequence, as that ensures that linkers can perform optimizing relaxations. This patch: * implements support for 16MiB TLS area size instead of 4GiB TLS area size. Ideally clang would grow an -mtls-size option to allow support for both, but that's not part of this patch. * by default doesn't produce local dynamic access patterns, as even modern ld.bfd and ld.gold linkers do not support the associated relocations. An option (-aarch64-elf-ldtls-generation) is added to enable generation of local dynamic code sequence, but is off by default. * makes sure that the exact expected code sequence for local dynamic and general dynamic accesses is produced, by making use of a new pseudo instruction. The patch also removes two (AArch64ISD::TLSDESC_BLR, AArch64ISD::TLSDESC_CALL) pre-existing AArch64-specific pseudo SDNode instructions that are superseded by the new one (TLSDESC_CALLSEQ). llvm-svn: 231227	2015-03-04 09:12:08 +00:00
Michael Kuperstein	fb95697c88	[DAGCombine] Fix a bug in a BUILD_VECTOR combine When trying to convert a BUILD_VECTOR into a shuffle, we try to split a single source vector that is twice as wide as the destination vector. We can not do this when we also need the zero vector to create a blend. This fixes PR22774. Differential Revision: http://reviews.llvm.org/D8040 llvm-svn: 231219	2015-03-04 07:27:39 +00:00
Davide Italiano	fcae934c03	[MC][Target] Implement support for R_X86_64_SIZE{32,64}. Differential Revision: D7990 Reviewed by: rafael, majnemer llvm-svn: 231216	2015-03-04 06:49:39 +00:00
Zachary Turner	653236596a	[llvm-pdbdump] Display full enum definitions. This will now display enum definitions both at the global scope as well as nested inside of classes. Additionally, it will no longer display enums at the global scope if the enum is nested. Instead, it will omit the definition of the enum globally and instead emit it in the corresponding class definition. llvm-svn: 231215	2015-03-04 06:09:53 +00:00
Frederic Riss	9412d63f68	Move emitDIE and emitAbbrevs to AsmPrinter. NFC. (They are called emitDwarfDIE and emitDwarfAbbrevs in their new home) llvm-dsymutil wants to reuse that code, but it doesn't have a DwarfUnit or a DwarfDebug object to call those. It has access to an AsmPrinter though. Having emitDIE in the AsmPrinter also removes the DwarfFile dependency on DwarfDebug, and thus the patch drops that field. Differential Revision: http://reviews.llvm.org/D8024 llvm-svn: 231210	2015-03-04 02:30:17 +00:00
Frederic Riss	cd04434cd5	Constify AsmPrinter passed to DIE methods. llvm-svn: 231209	2015-03-04 02:30:08 +00:00
David Blaikie	f22e370733	Workaround MSVC not providing implicit move members llvm-svn: 231204	2015-03-04 02:07:51 +00:00
Mehdi Amini	367bfa42d8	Use report_fatal_error instead of unreachable for -fast-isel-abort Suggestion by Andrea Di Biagio From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 231201	2015-03-04 01:48:39 +00:00
David Blaikie	0afee85176	unique_ptrify ValID::ConstantStructElts llvm-svn: 231200	2015-03-04 01:41:01 +00:00
David Blaikie	b9cc659efe	LLParser: Avoid copying ValIDs, the copy ctor is deprecated in C++11 due to the presence of a user-declared dtor llvm-svn: 231199	2015-03-04 01:40:07 +00:00
Rafael Espindola	310e4b592f	Use the vanilla func_end symbol for .size. No need to create yet another temp symbol. llvm-svn: 231198	2015-03-04 01:35:23 +00:00
Pete Cooper	9f5fe4a11f	Remove MCStreamer include which isn't used here. NFC llvm-svn: 231195	2015-03-04 01:24:26 +00:00
Pete Cooper	885bd8a2c4	This file should always have included MCAssembler and not MCStreamer. NFC llvm-svn: 231194	2015-03-04 01:24:24 +00:00
Pete Cooper	ef21bd444d	Remove MCStreamer.h include from MCContext.h and explicitly include it where necessary. NFC llvm-svn: 231193	2015-03-04 01:24:11 +00:00
David Blaikie	ed40025f37	Recommit r231168: unique_ptrify LiveRange::segmentSet GCC 4.7's libstdc++ doesn't have std::map::emplace, but it does have std::unordered_map::emplace, and the use case here doesn't appear to need ordering. The container has been changed in a separate/precursor patch, and now this patch should hopefully build cleanly even with GCC 4.7. & then I realized the order of the container did matter, so extra handling of ordering was added in r231189. Original commit message: This makes LiveRange non-copyable, and LiveInterval is already non-movable (due to the explicit dtor), so now it's non-copyable and non-movable. Fix the one case where we were relying on the (deprecated in C++11) implicit copy ctor of LiveInterval (which happened to work because the ctor created an object with a null segmentSet, so double-deleting the null pointer was fine). llvm-svn: 231192	2015-03-04 01:20:33 +00:00
David Blaikie	55c6222538	Recommit r231175: Change LiveStackAnalysis::SS2IntervalMap from std::map to std::unordered_map The order of this container was needed at one point - so, at that point create a temporary array of pointers, sort those, then iterate them. This keeps lookup efficient (& the lesser issue, of allowing the use of emplace... ), object identity preserved, and ordered iteration in the one place that requires it. While this has no functional change, I realize it does mean allocating an extra data structure and performing a sort - so if this looks suspect to anyone regarding perf characteristics, I'm all ears. llvm-svn: 231189	2015-03-04 01:15:53 +00:00
Matthias Braun	9f0c91f0d9	RegisterCoalescer: Gracefully continue if subrange merging fails. There is a known bug where the register coalescer fails to merge subranges when multiple ranges end up in the "overflow" bit 32 of the lanemasks. A proper fix for this is complicated so for now this is a workaround which lets the register coalescer drop the subregister liveness information (we just lose some precision by that) and continue. llvm-svn: 231186	2015-03-04 00:43:50 +00:00
Rafael Espindola	0ac5075f31	Drop the "eh_" from eh_func_begin and eh_func_end. They will be used for more than eh tables. llvm-svn: 231185	2015-03-04 00:27:43 +00:00
David Blaikie	90c59ccae6	Revert "unique_ptrify LiveRange::segmentSet" Apparently something does care about ordering of LiveIntervals... so revert all that stuff (r231175, r231176, r231177) & take some time to re-evaluate. llvm-svn: 231184	2015-03-04 00:15:02 +00:00
Philip Reames	6da37857d1	[RewriteStatepointsForGC] Fix a relocation bug w.r.t values defined by invoke instructions RewriteStatepointsForGC pass emits an alloca for each GC pointer which will be relocated. It then inserts stores after def and all relocations, and inserts loads before each use as well. In the end, mem2reg is used to update IR with relocations in SSA form. However, there is a problem with inserting stores for values defined by invoke instructions. The code didn't expect a def was a terminator instruction, and inserting instructions after these terminators resulted in malformed IR. This patch fixes this problem by handling invoke instructions as a special case. If the def is an invoke instruction, the store will be inserted at the beginning of the normal destination block. Since return value from invoke instruction does not dominate the unwind destination block, no action is needed there. Patch by: Chen Li Differential Revision: http://reviews.llvm.org/D7923 llvm-svn: 231183	2015-03-04 00:13:52 +00:00
Juergen Ributzka	1f7a17661c	Remove 'llvm.x86.avx2.vbroadcasti128' intrinsic. The intrinsic is no longer generated by the front-end. Remove the intrinsic and auto-upgrade it to a vector shuffle. Reviewed by Nadav This is related to rdar://problem/18742778. llvm-svn: 231182	2015-03-04 00:13:25 +00:00
David Blaikie	19660f03be	Recommit r231168: unique_ptrify LiveRange::segmentSet GCC 4.7's libstdc++ doesn't have std::map::emplace, but it does have std::unordered_map::emplace, and the use case here doesn't appear to need ordering. The container has been changed in a separate/precursor patch, and now this patch should hopefully build cleanly even with GCC 4.7. Original commit message: This makes LiveRange non-copyable, and LiveInterval is already non-movable (due to the explicit dtor), so now it's non-copyable and non-movable. Fix the one case where we were relying on the (deprecated in C++11) implicit copy ctor of LiveInterval (which happened to work because the ctor created an object with a null segmentSet, so double-deleting the null pointer was fine). llvm-svn: 231176	2015-03-03 23:53:03 +00:00
David Blaikie	923a25e957	Revert "unique_ptrify LiveRange::segmentSet" GCC 4.7 shakes fist (doesn't have std::map::emplace... ) This reverts commit r231168. llvm-svn: 231173	2015-03-03 23:44:07 +00:00
Jan Wen Voung	cd3d25a25f	Move TargetLibraryInfo data from two files into one common .def file. Summary: This makes it more obvious that the enum definition and the "StandardName" array is in sync. Mechanically refactored w/ a python script. Test Plan: still compiles Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7845 llvm-svn: 231172	2015-03-03 23:41:58 +00:00
David Blaikie	5a0206a3ff	unique_ptrify LiveRange::segmentSet This makes LiveRange non-copyable, and LiveInterval is already non-movable (due to the explicit dtor), so now it's non-copyable and non-movable. Fix the one case where we were relying on the (deprecated in C++11) implicit copy ctor of LiveInterval (which happened to work because the ctor created an object with a null segmentSet, so double-deleting the null pointer was fine). llvm-svn: 231168	2015-03-03 23:30:40 +00:00
Kostya Serebryany	be5e0ed919	[sanitizer/coverage] Add AFL-style coverage counters (search heuristic for fuzzing). Introduce -mllvm -sanitizer-coverage-8bit-counters=1 which adds imprecise thread-unfriendly 8-bit coverage counters. The run-time library maps these 8-bit counters to 8-bit bitsets in the same way AFL (http://lcamtuf.coredump.cx/afl/technical_details.txt) does: counter values are divided into 8 ranges and based on the counter value one of the bits in the bitset is set. The AFL ranges are used here: 1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+. These counters provide a search heuristic for single-threaded coverage-guided fuzzers, we do not expect them to be useful for other purposes. Depending on the value of -fsanitize-coverage=[123] flag, these counters will be added to the function entry blocks (=1), every basic block (=2), or every edge (=3). Use these counters as an optional search heuristic in the Fuzzer library. Add a test where this heuristic is critical. llvm-svn: 231166	2015-03-03 23:27:02 +00:00
Eric Christopher	6f1e5680f6	Remove subtarget dependence in pass pipeline setup for AArch64. llvm-svn: 231165	2015-03-03 23:22:40 +00:00
Reid Kleckner	423665311d	WinEH: Remove vestigial EH object Ultimately, we'll need to leave something behind to indicate which alloca will hold the exception, but we can figure that out when it comes time to emit the __CxxFrameHandler3 catch handler table. llvm-svn: 231164	2015-03-03 23:20:30 +00:00
David Majnemer	1bacc0abc9	InstCombine: Ensure select condition types are identical before merging Selection conditions may be vectors or scalars. Make sure InstCombine doesn't indiscriminately assume that a select which is value dependent on another select has an identical select condition type. This fixes PR22773. llvm-svn: 231156	2015-03-03 22:40:36 +00:00
David Blaikie	01a9a412a7	Avoid copying LiveInterval, this could lead to a double-delete llvm-svn: 231154	2015-03-03 22:25:48 +00:00
Eric Christopher	2891913f1a	Fix a problem where the TwoAddressInstructionPass generates redundant register moves in a loop. From: int M, total; void foo() { int i; for (i = 0; i < M; i++) { total = total + i / 2; } } This is the kernel loop: .LBB0_2: # %for.body =>This Inner Loop Header: Depth=1 movl %edx, %esi movl %ecx, %edx shrl $31, %edx addl %ecx, %edx sarl %edx addl %esi, %edx incl %ecx cmpl %eax, %ecx jl .LBB0_2 -------------------------- The first mov insn "movl %edx, %esi" could be removed if we change "addl %esi, %edx" to "addl %edx, %esi". The IR before TwoAddressInstructionPass is: BB#2: derived from LLVM BB %for.body Predecessors according to CFG: BB#1 BB#2 %vreg3<def> = COPY %vreg12<kill>; GR32:%vreg3,%vreg12 %vreg2<def> = COPY %vreg11<kill>; GR32:%vreg2,%vreg11 %vreg7<def,tied1> = SHR32ri %vreg3<tied0>, 31, %EFLAGS<imp-def,dead>; GR32:%vreg7,%vreg3 %vreg8<def,tied1> = ADD32rr %vreg3<tied0>, %vreg7<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg8,%vreg3,%vreg7 %vreg9<def,tied1> = SAR32r1 %vreg8<kill,tied0>, %EFLAGS<imp-def,dead>; GR32:%vreg9,%vreg8 %vreg4<def,tied1> = ADD32rr %vreg9<kill,tied0>, %vreg2<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg4,%vreg9,%vreg2 %vreg5<def,tied1> = INC64_32r %vreg3<kill,tied0>, %EFLAGS<imp-def,dead>; GR32:%vreg5,%vreg3 CMP32rr %vreg5, %vreg0, %EFLAGS<imp-def>; GR32:%vreg5,%vreg0 %vreg11<def> = COPY %vreg4; GR32:%vreg11,%vreg4 %vreg12<def> = COPY %vreg5<kill>; GR32:%vreg12,%vreg5 JL_4 <BB#2>, %EFLAGS<imp-use,kill> Now TwoAddressInstructionPass will choose vreg9 to be tied with vreg4. However, it doesn't see that there is a copy from vreg4 to vreg11 and another copy from vreg11 to vreg2 inside the loop body. To remove those copies, it is necessary to choose vreg2 to be tied with vreg4 instead of vreg9. This code pattern commonly appears when there is a reduction operation in a loop. So check for a reversed copy chain and if we encounter one then we can commute the add instruction so we can avoid a copy. Patch by Wei Mi. 
http://reviews.llvm.org/D7806 llvm-svn: 231148	2015-03-03 22:03:03 +00:00
Mehdi Amini	9a9738f6e5	Remove getDataLayout() from Instruction/GlobalValue/BasicBlock/Function Summary: This does not conceptually belong here. Instead provide a shortcut getModule() that provides access to the DataLayout. Reviewers: chandlerc, echristo Reviewed By: echristo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8027 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 231147	2015-03-03 22:01:13 +00:00
David Blaikie	49cfb81665	DAGCombiner::LoadedSlice: Remove explicit copy ctor in favor of the Rule of Zero This way, the copy assignment operator can be used without hitting the deprecated case in C++11. llvm-svn: 231144	2015-03-03 21:50:47 +00:00
David Blaikie	9469072367	RewriteStatepointsForGC::PhiState: Remove explicit copy ctor in favor of the Rule of Zero The assertion was just checking a class invariant that's pretty easy to verify by inspection (no mutating operations, and the two non-copy ctors already ensure the state is maintained) so remove the explicit copy ctor in favor of the default, thus allowing the use of the default copy assignment operator without hitting the C++11 deprecation here. llvm-svn: 231143	2015-03-03 21:49:07 +00:00
Nadav Rotem	029c5c7fdb	Teach ComputeNumSignBits about signed divisions. http://reviews.llvm.org/D8028 rdar://20023136 llvm-svn: 231140	2015-03-03 21:39:02 +00:00
David Blaikie	7f1e0565b3	Revert "Remove the explicit SDNodeIterator::operator= in favor of the implicit default" Accidentally committed a few more of these cleanup changes than intended. Still breaking these out & tidying them up. This reverts commit r231135. llvm-svn: 231136	2015-03-03 21:18:16 +00:00
David Blaikie	bb8da4c08f	Remove the explicit SDNodeIterator::operator= in favor of the implicit default There doesn't seem to be any need to assert that iterator assignment is between iterators over the same node - if you want to reuse an iterator variable to iterate another node, that's perfectly acceptable. Just don't mix comparisons between iterators into disjoint sequences, as usual. llvm-svn: 231135	2015-03-03 21:17:08 +00:00
David Blaikie	0ef4488df2	Remove LatencyPriorityQueue::dump because it relies on an implicit copy ctor which is deprecated in C++11 (due to the presence of a user-declared dtor in the base class) This type could be made copyable (= default a protected copy ctor in the base class, and preferably make the derived class final to avoid risks of providing a slicing copy operation to further derived classes) but it seemed easier to avoid that complexity for a dump function that I assume (by symmetry with ResourcePriorityQueue's dump, which was actively buggy) is not often used. llvm-svn: 231133	2015-03-03 21:16:56 +00:00
Paul Robinson	06a8eb8343	[X86][ELF] Correct relocation for DWARF TLS references Previously we had only Linux using DTPOFF for these; all X86 ELF targets should. Fixes a side issue mentioned in PR21077. Differential Revision: http://reviews.llvm.org/D8011 llvm-svn: 231130	2015-03-03 21:01:27 +00:00
Sanjay Patel	36a2dc895f	remove enum value names from comments; NFC llvm-svn: 231129	2015-03-03 20:58:35 +00:00
David Blaikie	0a756e6ad5	unique_ptrify ResourcePriorityQueue::ResourceModel llvm-svn: 231127	2015-03-03 20:49:08 +00:00
David Blaikie	b8cd65c5a2	Remove ResourcePriorityQueue::dump as it relies on copying a non-copyable type which would result in a double-delete llvm-svn: 231126	2015-03-03 20:49:05 +00:00
Sanjay Patel	948602bd17	use bool operator shortcut; NFC llvm-svn: 231123	2015-03-03 20:41:27 +00:00
Andrew Kaylor	e07b2a06d3	Fixing problem with field initialization order llvm-svn: 231122	2015-03-03 20:22:09 +00:00
Adrian Prantl	b283815a30	Fix PR22762. When emitting a DWARF expression check whether this is the frame register before checking if there is a DWARF register number for it. Thanks to H.J. Lu for diagnosing this and providing the testcase! llvm-svn: 231121	2015-03-03 20:12:52 +00:00
Andrew Kaylor	f0f5e46e07	Outline cleanup handlers for native Windows C++ exception handling Differential Revision: http://reviews.llvm.org/D7865 llvm-svn: 231117	2015-03-03 20:00:16 +00:00
Kit Barton	0cfa7b7ad0	Add the following 64-bit vector integer arithmetic instructions added in POWER8: vaddudm vsubudm vmulesw vmulosw vmuleuw vmulouw vmuluwm vmaxsd vmaxud vminsd vminud vcmpequd vcmpequd. vcmpgtsd vcmpgtsd. vcmpgtud vcmpgtud. vrld vsld vsrd vsrad Phabricator review: http://reviews.llvm.org/D7959 llvm-svn: 231115	2015-03-03 19:55:45 +00:00
David Blaikie	ca199cbf9b	Remove explicit no-op dtor in favor of the implicit dtor so as not to disable/deprecate the copy operations. llvm-svn: 231113	2015-03-03 19:53:02 +00:00
Eric Christopher	720ab84ba2	Add a comment above findRepresentativeClass explaining why it's where it is so that future generations can understand. llvm-svn: 231111	2015-03-03 19:47:14 +00:00
David Blaikie	5b240485b7	unique_ptrify FullDependenceAnalysis::DV Making this type a little harder to abuse (see workaround relating to use of the implicit copy ctor in the prior commit) llvm-svn: 231104	2015-03-03 19:20:18 +00:00
David Blaikie	c5771c214e	FullDependenceAnalysis: Avoid using the (deprecated in C++11) copy ctor llvm-svn: 231103	2015-03-03 19:20:16 +00:00
Dario Domizioli	5f7008a688	Fix PR22750: non-determinism causes assertion failure in DWARF generation The cause of the issue is the interaction of two factors: 1) When generating a DW_TAG_imported_declaration DIE which imports another imported declaration, the code in AsmPrinter/DwarfCompileUnit.cpp asserts that the second imported declaration must already have a DIE. 2) There is a non-determinism in the order in which imported declarations within the same scope are processed. Because of the non-determinism (2), it is possible that an imported declaration is processed before another one it depends on, breaking the assumption in (1). The source of the non-determinism is that the imported declaration DIDescriptors are sorted by scope in DwarfDebug::beginModule(); however that sort is not a stable_sort, therefore the order of the declarations within the same scope is not preserved. The attached patch changes the std::sort to a std::stable_sort and it fixes the problem. Test omitted due to it being non-deterministic and depending on the implementation of std::sort. llvm-svn: 231100	2015-03-03 18:40:53 +00:00
Dan Albert	675cffcb91	Make Triple::getOSVersion make sense for Android. Reviewers: srhines Reviewed By: srhines Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7928 llvm-svn: 231090	2015-03-03 18:23:51 +00:00
Eric Christopher	43768e311c	80-column fixup. llvm-svn: 231088	2015-03-03 17:54:39 +00:00
Chad Rosier	8e38f30e49	[AArch64] When combining constant mul of -3, prefer (sub x, (shl x, N)). This change only affects codegen when the constant is -3. llvm-svn: 231085	2015-03-03 17:31:01 +00:00
Duncan P. N. Exon Smith	e274180f0e	DebugInfo: Move new hierarchy into place Move the specialized metadata nodes for the new debug info hierarchy into place, finishing off PR22464. I've done bootstraps (and all that) and I'm confident this commit is NFC as far as DWARF output is concerned. Let me know if I'm wrong :). The code changes are fairly mechanical: - Bumped the "Debug Info Version". - `DIBuilder` now creates the appropriate subclass of `MDNode`. - Subclasses of DIDescriptor now expect to hold their "MD" counterparts (e.g., `DIBasicType` expects `MDBasicType`). - Deleted a ton of dead code in `AsmWriter.cpp` and `DebugInfo.cpp` for printing comments. - Big update to LangRef to describe the nodes in the new hierarchy. Feel free to make it better. Testcase changes are enormous. There's an accompanying clang commit on its way. If you have out-of-tree debug info testcases, I just broke your build. - `upgrade-specialized-nodes.sh` is attached to PR22564. I used it to update all the IR testcases. - Unfortunately I failed to find a way to script the updates to CHECK lines, so I updated all of these by hand. This was fairly painful, since the old CHECKs are difficult to reason about. That's one of the benefits of the new hierarchy. This work isn't quite finished, BTW. The `DIDescriptor` subclasses are almost empty wrappers, but not quite: they still have loose casting checks (see the `RETURN_FROM_RAW()` macro). Once they're completely gutted, I'll rename the "MD" classes to "DI" and kill the wrappers. I also expect to make a few schema changes now that it's easier to reason about everything. llvm-svn: 231082	2015-03-03 17:24:31 +00:00
Michael Kuperstein	84dff4c94c	[X86][Haswell][SchedModel] Fix patterns for scalar FMA3 variants. llvm-svn: 231073	2015-03-03 15:47:02 +00:00
Elena Demikhovsky	d207f17fa1	AVX-512: Moved patterns for masked load/store under avx_store, avx_load classes. No functional changes. llvm-svn: 231069	2015-03-03 15:03:35 +00:00
Daniel Jasper	8f239f83b0	During PHI elimination, split critical edges that move copies out of loops. This prevents the behavior observed in llvm.org/PR22369. I am not sure whether I am reading the code correctly, but the early exit based on isLiveOutPastPHIs() seems to make the wrong assumption that RegisterCoalescer won't be able to coalesce those copies later. This change hides the new behavior behind -no-phi-elim-live-out-early-exit as it currently breaks four tests: * Assertion in: CodeGen/Hexagon/hwloop-cleanup.ll * Worse code in: CodeGen/X86/coalescer-commute4.ll CodeGen/X86/phys_subreg_coalesce-2.ll CodeGen/X86/zlib-longest-match.ll The root cause here seems to be that the heuristic that determines the visitation order in RegisterCoalescer gets less lucky. llvm-svn: 231064	2015-03-03 10:23:11 +00:00
Owen Anderson	7325b91783	Cleanup after r230934 per Dave's suggestions. llvm-svn: 231056	2015-03-03 05:39:27 +00:00
Craig Topper	ef04b2b505	[X86] Remove some unused code from disassembler. llvm-svn: 231055	2015-03-03 05:24:03 +00:00
Ahmed Bougacha	afbd6887c4	[X86] Special-case 2x CMOV when custom-inserting. This lets us avoid a few copies that are otherwise hard to get rid of. The way this is done is, the custom-inserter looks at the following instruction for another CMOV, and replaces both at the same time. A previous version used a new CMOV2 opcode, but the custom inserter is expected to be able to return a different basic block anyway, which means it's OK - though far from ideal - to alter that block's contents. Explicitly document that, in case it ever makes a difference. Alternatives welcome! Follow-up to r231045. rdar://19767934 Closes http://reviews.llvm.org/D8019 llvm-svn: 231046	2015-03-03 01:21:16 +00:00
Ahmed Bougacha	066d0b8e64	[X86] Combine (cmov (and/or (setcc) (setcc))) into (cmov (cmov)). Fold and/or of setcc's to double CMOV: (CMOV F, T, ((cc1 \| cc2) != 0)) -> (CMOV (CMOV F, T, cc1), T, cc2) (CMOV F, T, ((cc1 & cc2) != 0)) -> (CMOV (CMOV T, F, !cc1), F, !cc2) When we can't use the CMOV instruction, it might increase branch mispredicts. When we can, or when there is no mispredict, this improves throughput and reduces register pressure. These can't be caught by generic combines, because the pattern can appear when legalizing some instructions (such as fcmp une). rdar://19767934 http://reviews.llvm.org/D7634 llvm-svn: 231045	2015-03-03 01:09:14 +00:00
Peter Collingbourne	da2dbf21a9	LowerBitSets: Use byte arrays instead of bit sets to represent in-memory bit sets. By loading from indexed offsets into a byte array and applying a mask, a program can test bits from the bit set with a relatively short instruction sequence. For example, suppose we have 15 bit sets to lay out: A (16 bits), B (15 bits), C (14 bits), D (13 bits), E (12 bits), F (11 bits), G (10 bits), H (9 bits), I (7 bits), J (6 bits), K (5 bits), L (4 bits), M (3 bits), N (2 bits), O (1 bit) These bits can be laid out in a 16-byte array like this: Byte Offset 0123456789ABCDEF Bit 7 HHHHHHHHHIIIIIII 6 GGGGGGGGGGJJJJJJ 5 FFFFFFFFFFFKKKKK 4 EEEEEEEEEEEELLLL 3 DDDDDDDDDDDDDMMM 2 CCCCCCCCCCCCCCNN 1 BBBBBBBBBBBBBBBO 0 AAAAAAAAAAAAAAAA For example, to test bit X of A, we evaluate ((bits[X] & 1) != 0), or to test bit X of I, we evaluate ((bits[9 + X] & 0x80) != 0). This can be done in 1-2 machine instructions on x86, or 4-6 instructions on ARM. This uses the LPT multiprocessor scheduling algorithm to lay out the bits efficiently. Saves ~450KB of instructions in a recent build of Chromium. Differential Revision: http://reviews.llvm.org/D7954 llvm-svn: 231043	2015-03-03 00:49:28 +00:00
Andrew Kaylor	72029c6f2f	Remap arguments and non-alloca values used by outlined C++ exception handlers. Differential Revision: http://reviews.llvm.org/D7844 llvm-svn: 231042	2015-03-03 00:41:03 +00:00
Benjamin Kramer	838752d3f6	LoopIdiom: Give globals for memset_pattern16 private linkage. There's really no reason to have them have entries in the symbol table anymore. Old versions of ld64 had some bugs in this area but those have been fixed long ago. llvm-svn: 231041	2015-03-03 00:17:09 +00:00
Michael Zolotukhin	21abdf983a	TLI: Factor out sanitizeFunctionName. NFC. llvm-svn: 231034	2015-03-02 23:24:40 +00:00
Adrian Prantl	b846acc6c6	Revert "Revert "For the dwarf expression code get the subtarget off of the current"" This reapplies r230990 without modifications. llvm-svn: 231024	2015-03-02 22:02:36 +00:00
Adrian Prantl	92da14b244	Refactor DebugLocDWARFExpression so it doesn't require access to the TargetRegisterInfo. DebugLocEntry now holds a buffer with the raw bytes of the pre-calculated DWARF expression. Ought to be NFC, but it does slightly alter the output format of the textual assembly. This reapplies 230930 without the assertion in DebugLocEntry::finalize() because not all Machine registers can be lowered into DWARF register numbers and floating point constants cannot be expressed. llvm-svn: 231023	2015-03-02 22:02:33 +00:00
Sanjoy Das	2d38031271	Revert some changes that were made to fix PR20680. This re-lands change r230921. r230921 was reverted because it broke a clang test; a checkin fixing the clang test will be committed shortly. Summary: As far as I can tell, the real bug causing the issue was fixed in r230533. SCEVExpander should mark an increment operation as nuw or nsw only if it can prove that the operation does not overflow. There shouldn't be any situation where we have to do something different because of no-wrap flags generated by SCEVExpander. Revert "IndVarSimplify: Allow LFTR to fire more often" This reverts commit 1ade0f0faa98877b688e0b9da58e876052c1e04e (SVN: 222213). Revert "IndVarSimplify: Don't let LFTR compare against a poison value" This reverts commit c0f2b8b528d8a37b0a1522aae90af649d6357eb5 (SVN: 217102). Reviewers: majnemer, atrick, spatel Differential Revision: http://reviews.llvm.org/D7979 llvm-svn: 231018	2015-03-02 21:41:07 +00:00
Rui Ueyama	3206b79d53	Use read{16,32,64}{le,be}() instead of *reinterpret_cast<u{little,big}{16,32,64}_t>(). llvm-svn: 231016	2015-03-02 21:19:12 +00:00
Michael Zolotukhin	d3b76a3b01	TLI: Use lambda. NFC. llvm-svn: 231011	2015-03-02 20:50:08 +00:00
Michael Zolotukhin	9302236680	Make ToVectorTy static. llvm-svn: 231007	2015-03-02 20:43:24 +00:00
Adrian Prantl	2185aa179d	Revert "Refactor DebugLocDWARFExpression so it doesn't require access to the" This reverts commit 230975 to investigate buildbot breakage. llvm-svn: 231004	2015-03-02 20:01:54 +00:00
Adrian Prantl	abb9192652	Revert "For the dwarf expression code get the subtarget off of the current" This reverts commit 230990 because also reverting 230975. llvm-svn: 231003	2015-03-02 20:01:47 +00:00
Paul Robinson	c583abc888	Remove useless .debug_macinfo section setup. llvm-svn: 231001	2015-03-02 19:52:42 +00:00
Eric Christopher	d8cacd2e97	For the dwarf expression code get the subtarget off of the current MachineFunction. llvm-svn: 230990	2015-03-02 19:01:47 +00:00
Juergen Ributzka	a57d588cb7	Restore LLVMLinkModules C API until it is properly deprecated. Add the enum "LLVMLinkerMode" back for backwards-compatibility and add the linker mode parameter back to the "LLVMLinkModules" function. The parameter is ignored and has no effect. Patch provided by: Filip Pizlo Reviewed by: Rafael and Sean llvm-svn: 230988	2015-03-02 18:59:38 +00:00
Jan Vesely	468e055f54	R600: Use c++11 style for loop Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <tom@stellard.net> llvm-svn: 230987	2015-03-02 18:56:52 +00:00
Paul Robinson	9f4cfc574e	Revert r230979, should apply to all X86 ELF. llvm-svn: 230985	2015-03-02 18:50:18 +00:00
Paul Robinson	10ae2e52de	[PS4] Correct relocation for DWARF TLS references. llvm-svn: 230979	2015-03-02 17:44:52 +00:00
Justin Bogner	64d2cdf4ec	Detect malformed YAML sequence in yaml::Input::beginSequence() When reading a yaml::SequenceTraits object, YAMLIO does not report an error if the yaml item is not a sequence. Instead, YAMLIO reads an empty sequence. For example: --- seq: foo: 1 bar: 2 ... If `seq` is a SequenceTraits object, then reading the above yaml will yield `seq` as an empty sequence. Fix this to report an error for the above mapping ("not a sequence") Patch by William Fisher. Thanks! llvm-svn: 230976	2015-03-02 17:26:43 +00:00
Adrian Prantl	d50bca7314	Refactor DebugLocDWARFExpression so it doesn't require access to the TargetRegisterInfo. DebugLocEntry now holds a buffer with the raw bytes of the pre-calculated DWARF expression. Ought to be NFC, but it does slightly alter the output format of the textual assembly. This reapplies 230930 with a relaxed assertion in DebugLocEntry::finalize() that allows for empty DWARF expressions for constant FP values. llvm-svn: 230975	2015-03-02 17:21:06 +00:00
Benjamin Kramer	0b6742aeb5	Accidentally inverted the condition again. Sorry. llvm-svn: 230973	2015-03-02 16:45:08 +00:00
Benjamin Kramer	f43de1879a	Avoid assertion in MSVC 2013 debug builds. llvm-svn: 230972	2015-03-02 16:42:56 +00:00
Benjamin Kramer	257ed69291	AsmWriter: Only print one space after the load type Before: %x = load i32, i32* %i After: %x = load i32, i32* %i Purely cosmetic, so no new test case. llvm-svn: 230966	2015-03-02 15:24:41 +00:00
Benjamin Kramer	cc34ba687b	SLPVectorizer: Rewrite ArrayRef slice compare to be more idiomatic. NFC intended. llvm-svn: 230965	2015-03-02 15:24:36 +00:00
Elena Demikhovsky	18fd49602b	AVX-512: Add assembly parser support for Rounding mode By Asaf Badouh <asaf.badouh@intel.com> llvm-svn: 230962	2015-03-02 15:00:34 +00:00
Benjamin Kramer	e86b0f186b	NVPTX: Remove dead code. Fun fact: This file was never referenced since the initial checkin of the NVPTX backend. llvm-svn: 230957	2015-03-02 13:16:28 +00:00
Vasileios Kalintiris	e741eb2c7d	[mips] Optimize conditional moves where RHS is zero. Summary: When the RHS of a conditional move node is zero, we can utilize the $zero register by inverting the conditional move instruction and by swapping the order of its True/False operands. Reviewers: dsanders Differential Revision: http://reviews.llvm.org/D7945 llvm-svn: 230956	2015-03-02 12:47:32 +00:00
Elena Demikhovsky	2689d78909	AVX-512: Simplified MOV patterns, no functional changes. llvm-svn: 230954	2015-03-02 12:46:21 +00:00
Benjamin Kramer	8008e9f624	Simplify code. NFC. llvm-svn: 230948	2015-03-02 11:57:04 +00:00
Owen Anderson	63fbf10c32	Teach the verifier to enforce that the alignment argument of memory intrinsics must be a power of 2. llvm-svn: 230941	2015-03-02 09:35:06 +00:00
Owen Anderson	5af4b21c2e	Teach DataLayout that alignments on basic types must be powers of two. Fixes assertion failures/crashes on bad datalayout specifications. llvm-svn: 230940	2015-03-02 09:35:03 +00:00
Owen Anderson	ab1c7a77d2	Teach DataLayout that ABI alignments for non-aggregate types must be non-zero. This manifested as assertions and/or crashes in later phases of optimization, depending on the build configuration. llvm-svn: 230939	2015-03-02 09:34:59 +00:00
Owen Anderson	040f2f890e	Teach DataLayout that pointer ABI and preferred alignments are required to be powers of two. Previously this resulted in asserts and/or crashes (depending on build configuration) at various phases in the optimizer. llvm-svn: 230938	2015-03-02 06:33:51 +00:00
Owen Anderson	5bc2bbe601	Teach DataLayout that zero-byte pointer sizes don't make sense. Previously this would result in assertion failures or simply crashes at various points in the optimizer when trying to create types of zero bit width. llvm-svn: 230936	2015-03-02 06:00:02 +00:00
Owen Anderson	576a9a2728	Teach the LLParser to fail gracefully when it encounters an invalid label name. Previously it would either assert in +Asserts, or crash in -Asserts. Found by fuzzing LLParser. llvm-svn: 230935	2015-03-02 05:25:09 +00:00
Owen Anderson	91bdf07650	Fix a crash in the LL parser where it failed to validate that the pointer operand of a GEP was valid. This manifested as an assertion failure in +Asserts builds, and a hard crash in -Asserts builds. Found by fuzzing the LL parser. llvm-svn: 230934	2015-03-02 05:25:06 +00:00
Zachary Turner	7797c726b9	[llvm-pdbdump] Many minor fixes and improvements A short list of some of the improvements: 1) Now supports -all command line argument, which implies many other command line arguments to simplify usage. 2) Now supports -no-compiler-generated command line argument to exclude compiler generated types. 3) Prints base class list. 4) -class-definitions implies -types. 5) Proper display of bitfields. 6) Can now distinguish between struct/class/interface/union. And a few other minor tweaks. llvm-svn: 230933	2015-03-02 04:39:56 +00:00
Nico Weber	968ceddca9	Revert r230930, it caused PR22747. llvm-svn: 230932	2015-03-02 04:37:11 +00:00
Craig Topper	9c26bcca5a	[X86] There are only 8 mask registers. Fail disassembly if instruction tries to reference more. llvm-svn: 230931	2015-03-02 03:33:11 +00:00
Adrian Prantl	e2c9e64532	Refactor DebugLocDWARFExpression so it doesn't require access to the TargetRegisterInfo. DebugLocEntry now holds a buffer with the raw bytes of the pre-calculated DWARF expression. Ought to be NFC, but it does slightly alter the output format of the textual assembly. llvm-svn: 230930	2015-03-02 02:38:18 +00:00
NAKAMURA Takumi	0cd23c842e	Revert r230921, "Revert some changes that were made to fix PR20680.", for now. It caused a failure on clang/test/Misc/backend-optimization-failure.cpp . llvm-svn: 230929	2015-03-02 01:14:03 +00:00
Craig Topper	09b27e7b24	[X86] Fix disassembler crash on AVX512 cmpps/cmppd with immediate that doesn't fit in 5-bits. Fixes PR22743. llvm-svn: 230924	2015-03-02 00:22:29 +00:00
Sanjoy Das	e5d1466ab3	[AArch64] fix an invalid-iterator-use bug. Summary: In AArch64PromoteConstant::appendAndTransferDominatedUses, `InsertPts[NewPt]` invalidates IPI. Therefore, `InsertPts[NewPt] = std::move(IPI->second)` is not legal. This was caught by running `make check` with http://reviews.llvm.org/D7931. Reviewers: t.p.northover, grosbach, bkramer Reviewed By: bkramer Subscribers: aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D7988 llvm-svn: 230923	2015-03-02 00:17:18 +00:00
Sanjoy Das	876bd51486	Revert some changes that were made to fix PR20680. Summary: As far as I can tell, the real bug causing the issue was fixed in r230533. SCEVExpander should mark an increment operation as nuw or nsw only if it can prove that the operation does not overflow. There shouldn't be any situation where we have to do something different because of no-wrap flags generated by SCEVExpander. Revert "IndVarSimplify: Allow LFTR to fire more often" This reverts commit 1ade0f0faa98877b688e0b9da58e876052c1e04e (SVN: 222213). Revert "IndVarSimplify: Don't let LFTR compare against a poison value" This reverts commit c0f2b8b528d8a37b0a1522aae90af649d6357eb5 (SVN: 217102). Reviewers: majnemer, atrick, spatel Differential Revision: http://reviews.llvm.org/D7979 llvm-svn: 230921	2015-03-01 23:36:26 +00:00
Benjamin Kramer	42cc33e816	X86: Replace variadic function with init list. NFC. llvm-svn: 230911	2015-03-01 21:47:40 +00:00
Benjamin Kramer	0a446fd56c	Add missing includes. make_unique proliferated everywhere. llvm-svn: 230909	2015-03-01 21:28:53 +00:00
Arnaud A. de Grandmaison	a57ca81eb4	[PBQP] Address post-commit style comment for r230904. NFC. Thanks David ! llvm-svn: 230908	2015-03-01 21:22:50 +00:00
Benjamin Kramer	030133c5db	ArrayRef: Remove the equals helper with many arguments. With initializer lists there is a really neat idiomatic way to write this, 'ArrayRef.equals({1, 2, 3, 4, 5})'. Remove the equal method which always had a hard limit on the number of arguments. I considered rewriting it with variadic templates but that's not really a good fit for a function with homogeneous arguments. 'ArrayRef == {1, 2, 3, 4, 5}' would've been even more awesome, but C++11 doesn't allow init lists with binary operators. llvm-svn: 230907	2015-03-01 21:05:05 +00:00
Arnaud A. de Grandmaison	21fa09890c	[PBQP] Do not add an edge between nodes with totally disjoint allowed registers Such edges are zero matrix, and they bring no additional info to the allocation problem, apart from contributing to nodes' degree. Removing those edges is expected to improve allocation time. Tune the spill cost comparison, as this gives better average performances now that the nodes' degrees has changed. llvm-svn: 230904	2015-03-01 20:39:34 +00:00
Benjamin Kramer	7149aabf8b	Make some non-constant static variables non-static or fully const. Otherwise we have to emit thread-safe initialization for them. NFC. llvm-svn: 230894	2015-03-01 18:09:56 +00:00
Elena Demikhovsky	0995479e67	Reverted 230471 - gather scatter handling in table gen. llvm-svn: 230892	2015-03-01 08:23:41 +00:00
Elena Demikhovsky	02ffd26023	AVX-512: Added mask and rounding mode for scalar arithmetics Added more tests for scalar instructions to distinguish between AVX and AVX-512 forms. llvm-svn: 230891	2015-03-01 07:44:04 +00:00
Zachary Turner	b52d08d9dd	[llvm-pdbdump] Clean up method signatures. llvm-svn: 230889	2015-03-01 06:51:29 +00:00
Sanjay Patel	b8c907e2a7	avoid infinite looping when folding vector multiplies of constants (PR22698) We were missing a check for the following fold in DAGCombiner: // fold (fmul (fmul x, c1), c2) -> (fmul x, (fmul c1, c2)) If 'x' is also a constant, then we shouldn't do anything. Otherwise, we could end up swapping the operands back and forth forever. This should fix: http://llvm.org/bugs/show_bug.cgi?id=22698 Differential Revision: http://reviews.llvm.org/D7917 llvm-svn: 230884	2015-03-01 00:09:35 +00:00
Duncan P. N. Exon Smith	9c3b89448a	DebugInfo: Use TempMDNode in DIDescriptor::replaceAllUsesWith() Start using `TempMDNode` in `DIDescriptor::replaceAllUsesWith()` (effectively `std::unique_ptr<MDNode, MDNode::deleteTemporary>`). Besides making ownership more explicit, this prepares for when `DIDescriptor` refers to nodes that are not `MDTuple`. The old logic for "replacing" a node with itself used `MDNode::get()` to return a new (uniqued) `MDTuple`, while the new logic just defers to `MDNode::replaceWithUniqued()` (which also typically saves an allocation and RAUW traffic by mutating the temporary in place). llvm-svn: 230879	2015-02-28 23:48:02 +00:00
Duncan P. N. Exon Smith	16d182acb9	Optimize metadata node fields for CHECK-ability While gaining practical experience hand-updating CHECK lines (for moving the new debug info hierarchy into place), I learnt a few things about CHECK-ability of the specialized node assembly output. - The first part of a `CHECK:` is to identify the "right" node (this is especially true if you intend to use the new `CHECK-SAME` feature, since the first CHECK needs to identify the node correctly before you can split the line). - If there's a `tag:`, it should go first. - If there's a `name:`, it should go next (followed by the `linkageName:`, if any). - If there's a `scope:`, it should follow after that. - When a node type supports multiple DW_TAGs, but one is implied by its name and is overwhelmingly more common, the `tag:` field is terribly uninteresting unless it's different. - `MDBasicType` is almost always `DW_TAG_base_type`. - `MDTemplateValueParameter` is almost always `DW_TAG_template_value_parameter`. - Printing `name: ""` doesn't improve CHECK-ability, and there are far more nodes than I realized that are commonly nameless. - There are a few other fields that similarly aren't very interesting when they're empty. This commit updates the `AsmWriter` as suggested above (and makes necessary changes in `LLParser` for round-tripping). llvm-svn: 230877	2015-02-28 23:21:38 +00:00
Sanjay Patel	1c3eaecc09	fix typo; NFC llvm-svn: 230876	2015-02-28 22:25:06 +00:00
Duncan P. N. Exon Smith	c296fcc39e	AsmWriter: Escape string fields in metadata Properly escape string fields in metadata. I've added a spot-check with direct coverage for `MDFile::getFilename()`, but we'll get more coverage once the hierarchy is moved into place (since this comes up in various checked-in testcases). I've replicated the `if` logic using the `ShouldSkipEmpty` flag (although a follow-up commit is going to change how often this flag is specified); no NFCI other than escaping the string fields. llvm-svn: 230875	2015-02-28 22:20:16 +00:00
Duncan P. N. Exon Smith	79cf9705c7	AsmWriter: Extract writeStringField(), NFCI Extract logic for escaping a string field in the new debug info hierarchy from `GenericDebugNode`. A follow-up commit will use it far more widely (hence the dead code for `ShouldSkipEmpty`). llvm-svn: 230873	2015-02-28 22:16:56 +00:00
Zachary Turner	ccf0415973	[llvm-pdbdump] Better error handling. Previously it was impossible to distinguish between "There is no PDB implementation for this platform" and "I tried to load the PDB, but couldn't find the file", making it hard to figure out if you built llvm-pdbdump incorrectly or if you just mistyped a file name. This patch adds proper error handling so that we can know exactly what went wrong. llvm-svn: 230868	2015-02-28 20:23:18 +00:00
Benjamin Kramer	49a1132976	DwarfAccelTable: We know how many hashes we have in the output, just reserve the precise number llvm-svn: 230865	2015-02-28 20:15:00 +00:00
Benjamin Kramer	48ea372d90	StackColoring: Move set instead of copying. NFC. llvm-svn: 230864	2015-02-28 20:14:38 +00:00
Benjamin Kramer	4c5dcb0a83	LiveRange: Replace a creative vector erase loop with std::remove_if. I didn't see this so far because it scans backwards, but that doesn't make it any less quadratic. NFC. llvm-svn: 230863	2015-02-28 20:14:27 +00:00
Mehdi Amini	04f0f5ba61	Fixup for recent -fast-isel-abort change: code didn't match description Level 1 should abort for all instructions but call/terminators/args. Instead it was aborting only if the level was > 2 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 230861	2015-02-28 19:34:54 +00:00
Craig Topper	782d620657	[X86] Remove the blendpd/blendps/pblendw/pblendd intrinsics. They can be represented by shuffle_vector instructions. llvm-svn: 230860	2015-02-28 19:33:17 +00:00
Zachary Turner	9e1ce99d81	[raw_ostream] When printing color on Windows, use correct bg color. When using SetConsoleTextAttribute() to set the foreground or background color, if you don't explicitly set both colors, then a default value of black will be chosen for whichever you don't specify a value for. This is annoying when you have a non default console background color, for example, and you try to set the foreground color. This patch gets the existing fg/bg color and when you set one attribute, sets the opposite attribute to its existing color prior to committing the update. Reviewed by: Aaron Ballman Differential Revision: http://reviews.llvm.org/D7967 llvm-svn: 230859	2015-02-28 19:08:27 +00:00
Alexei Starovoitov	1b7b56fbcc	bpf: fix build complete the plumbing of passing TargetRegisterInfo through computeRegisterProperties started by r230583 llvm-svn: 230858	2015-02-28 18:03:04 +00:00
Benjamin Kramer	cb570f1bc9	TRE: Just erase dead BBs and tweak the iteration loop not to increment the deleted BB iterator. Leaving empty blocks around just opens up a can of bugs like PR22704. Deleting them early also slightly simplifies code. Thanks to Sanjay for the IR test case. llvm-svn: 230856	2015-02-28 16:47:27 +00:00
Yaron Keren	d602c35eca	Silence three more variable set but not used warnings, NFC. llvm-svn: 230853	2015-02-28 15:29:17 +00:00
Benjamin Kramer	5fbfe2ffdc	Convert push_back loops into append calls. No functionality change intended. llvm-svn: 230849	2015-02-28 13:20:15 +00:00
Yaron Keren	42a7adf171	Silence variable set but not used warning, NFC. llvm-svn: 230848	2015-02-28 13:11:24 +00:00
Benjamin Kramer	f1362f6196	ArrayRefize memory operand folding. NFC. llvm-svn: 230846	2015-02-28 12:04:00 +00:00
Benjamin Kramer	4f6ac16292	Replace std::copy with a back inserter with vector append where feasible All of the cases were just appending from random access iterators to a vector. Using insert/append can grow the vector to the perfect size directly and moves the growing out of the loop. No intended functionality change. llvm-svn: 230845	2015-02-28 10:11:12 +00:00
Philip Reames	28e61ce60f	[RewriteStatepointsForGC] Reduce indentation via early continue [NFC] llvm-svn: 230836	2015-02-28 01:57:44 +00:00
Philip Reames	2e5bcbe8d5	[RewriteStatepointsForGC] Fix another order of iteration bug It turns out the naming of inserted phis and selects is sensitive to the order in which two sets are iterated. We need to nail this down to avoid non-deterministic output and possible test failures. The modified test is the one I first noticed something odd in. The change is making it more strict to report the error. With the test change, but without the code change, the test fails roughly 1 in 5. With the code change, I've run ~30 runs without error. Long term, the right fix here is to adjust the naming scheme. I'm checking in this hack to avoid any possible non-determinism in the tests over the weekend. Just because I only noticed one case doesn't mean it's actually the only case. I hope to get to the right change Monday. std->llvm data structure changes bugfix change #3 llvm-svn: 230835	2015-02-28 01:52:09 +00:00
Philip Reames	f986d68b36	[RewriteStatepointsForGC] Reduce indentation via early continue [NFC] llvm-svn: 230829	2015-02-28 00:54:41 +00:00
Philip Reames	a226e6115c	[RewriteStatepointsForGC] Fix iterator invalidation bug Inserting into a DenseMap you're iterating over is not well defined. This is unfortunate since this is well defined on a std::map. "cleanup per llvm code style standards" bug #2 llvm-svn: 230827	2015-02-28 00:47:50 +00:00
Philip Reames	a5aeaf4b4f	[RewriteStatepointsForGC] Add tests for the base pointer identification algorithm These tests cover the 'base object' identification and rewriting portion of RewriteStatepointsForGC. These aren't completely exhaustive, but they've proven to be reasonably effective over time at finding regressions. In the process of porting these tests over, I found my first "cleanup per llvm code style standards" bug. We were relying on the order of iteration when testing the base pointers found for a derived pointer. When we switched from std::set to DenseSet, this stopped being a safe assumption. I'm suspecting I'm going to find more of those. In particular, I'm now really wondering about the main iteration loop for this algorithm. I need to go take a closer look at the assumptions there. I'm not really happy with the fact these are testing what is essentially debug output (i.e. enabled via command line flags). Suggestions for how to structure this better are very welcome. llvm-svn: 230818	2015-02-28 00:20:48 +00:00
Benjamin Kramer	012b1514b9	MachineDominators: Move applySplitCriticalEdges into the cpp file. It's too big for inlining anyways. Also clean it up slightly. No functionality change intended. llvm-svn: 230806	2015-02-27 23:13:13 +00:00
Bill Schmidt	e3959eb54e	[PowerPC] Fix PR22711 - Misaligned .toc section Straightforward patch to emit an alignment directive when emitting a TOC entry. The test case was generated from the test in PR22711 that demonstrated a misaligned .toc section. The object code is run through llvm-readobj to verify that the correct alignment has been applied to the .toc section. Thanks to Ulrich Weigand for running down where the fix was needed. llvm-svn: 230801	2015-02-27 22:14:10 +00:00
Benjamin Kramer	4e3b903a95	Reduce double set lookups. llvm-svn: 230798	2015-02-27 21:43:14 +00:00
David Blaikie	a79ac14fa6	[opaque pointer type] Add textual IR support for explicit type parameter to load instruction Essentially the same as the GEP change in r230786. A similar migration script can be used to update test cases, though a few more test case improvements/changes were required this time around: (r229269-r229278) import fileinput import sys import re pat = re.compile(r"((?:=\|:\|^)\s*load (?:atomic )?(?:volatile )?(.*?))(\| addrspace\(\d+\) *)\*($\| *(?:%\|@\|null\|undef\|blockaddress\|getelementptr\|addrspacecast\|bitcast\|inttoptr\|\[\[[a-zA-Z]\|\{\{).*$)") for line in sys.stdin: sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line)) Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7649 llvm-svn: 230794	2015-02-27 21:17:42 +00:00
Charles Davis	83687fb9e6	Target/X86: Never use the redzone for Win64 ABI functions. Summary: Until now, we did this (among other things) based on whether or not the target was Windows. This is clearly wrong, not just for Win64 ABI functions on non-Windows, but for System V ABI functions on Windows, too. In this change, we make this decision based on the ABI the calling convention specifies instead. Reviewers: rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7953 llvm-svn: 230793	2015-02-27 21:11:16 +00:00
Hal Finkel	5c3cacf5c0	[PowerPC] Use vector types for memcpy and friends (sometimes) When using Altivec, we can use vector loads and stores for aligned memcpy and friends. Starting with the P7 and VSX, we have reasonable unaligned vector stores. Starting with the P8, we have fast unaligned loads too. For QPX, we use vector loads and stores, but only for aligned memory accesses. llvm-svn: 230788	2015-02-27 19:58:28 +00:00
David Blaikie	79e6c74981	[opaque pointer type] Add textual IR support for explicit type parameter to getelementptr instruction One of several parallel first steps to remove the target type of pointers, replacing them with a single opaque pointer type. This adds an explicit type parameter to the gep instruction so that when the first parameter becomes an opaque pointer type, the type to gep through is still available to the instructions. * This doesn't modify gep operators, only instructions (operators will be handled separately) * Textual IR changes only. Bitcode (including upgrade) and changing the in-memory representation will be in separate changes. * geps of vectors are transformed as: getelementptr <4 x float> %x, ... ->getelementptr float, <4 x float> %x, ... Then, once the opaque pointer type is introduced, this will ultimately look like: getelementptr float, <4 x ptr> %x with the unambiguous interpretation that it is a vector of pointers to float. * address spaces remain on the pointer, not the type: getelementptr float addrspace(1)* %x ->getelementptr float, float addrspace(1)* %x Then, eventually: getelementptr float, ptr addrspace(1) %x Importantly, the massive amount of test case churn has been automated by same crappy python code. I had to manually update a few test cases that wouldn't fit the script's model (r228970,r229196,r229197,r229198). The python script just massages stdin and writes the result to stdout, I then wrapped that in a shell script to handle replacing files, then using the usual find+xargs to migrate all the files. 
update.py: import fileinput import sys import re ibrep = re.compile(r"(^.*?[^%\w]getelementptr inbounds )(((?:<\d* x )?)(.*?)(\| addrspace\(\d\)) *\*(\|>)(?:$\| *(?:%\|@\|null\|undef\|blockaddress\|getelementptr\|addrspacecast\|bitcast\|inttoptr\|\[\[[a-zA-Z]\|\{\{).*$))") normrep = re.compile( r"(^.*?[^%\w]getelementptr )(((?:<\d* x )?)(.*?)(\| addrspace\(\d\)) *\*(\|>)(?:$\| *(?:%\|@\|null\|undef\|blockaddress\|getelementptr\|addrspacecast\|bitcast\|inttoptr\|\[\[[a-zA-Z]\|\{\{).*$))") def conv(match, line): if not match: return line line = match.groups()[0] if len(match.groups()[5]) == 0: line += match.groups()[2] line += match.groups()[3] line += ", " line += match.groups()[1] line += "\n" return line for line in sys.stdin: if line.find("getelementptr ") == line.find("getelementptr inbounds"): if line.find("getelementptr inbounds") != line.find("getelementptr inbounds ("): line = conv(re.match(ibrep, line), line) elif line.find("getelementptr ") != line.find("getelementptr ("): line = conv(re.match(normrep, line), line) sys.stdout.write(line) apply.sh: for name in "$@" do python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name" rm -f "$name.tmp" done The actual commands: From llvm/src: find test/ -name *.ll \| xargs ./apply.sh From llvm/src/tools/clang: find test/ -name *.mm -o -name *.m -o -name *.cpp -o -name *.c \| xargs -I '{}' ../../apply.sh "{}" From llvm/src/tools/polly: find test/ -name *.ll \| xargs ./apply.sh After that, check-all (with llvm, clang, clang-tools-extra, lld, compiler-rt, and polly all checked out). The extra 'rm' in the apply.sh script is due to a few files in clang's test suite using interesting unicode stuff that my python script was throwing exceptions on. None of those files needed to be migrated, so it seemed sufficient to ignore those cases. Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7636 llvm-svn: 230786	2015-02-27 19:29:02 +00:00
Eric Christopher	3b94e33277	Remove the Forward Control Flow Integrity pass and its dependencies. This work is currently being rethought along different lines and if this work is needed it can be resurrected out of svn. Remove it for now as no current work is ongoing on it and it's unused. Verified with the authors before removal. llvm-svn: 230780	2015-02-27 19:03:38 +00:00
Mehdi Amini	945a660cbc	Change the fast-isel-abort option from bool to int to enable "levels" Summary: Currently fast-isel-abort will only abort for regular instructions, and just warn for function calls, terminators, function arguments. There is already fast-isel-abort-args but nothing for calls and terminators. This change turns the fast-isel-abort options into an integer option, so that multiple levels of strictness can be defined. This will help avoid being surprised when the "abort" option does not actually abort, and makes it possible to write tests that verify that no intrinsics are forgotten by fast-isel. Reviewers: resistor, echristo Subscribers: jfb, llvm-commits Differential Revision: http://reviews.llvm.org/D7941 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 230775	2015-02-27 18:32:11 +00:00
Rafael Espindola	629cdbae94	Centralize handling of the eh_begin and eh_end labels. This removes a bit of duplicated code and more importantly, remembers the labels so that they don't need to be looked up by name. This in turn allows for any name to be used and avoids a crash if the name we wanted was already taken. llvm-svn: 230772	2015-02-27 18:18:39 +00:00
Sanjay Patel	af0ff1093e	remove function names from comments; NFC llvm-svn: 230771	2015-02-27 18:07:41 +00:00
Sanjay Patel	b92e9164d2	remove function names from comments; NFC llvm-svn: 230766	2015-02-27 17:27:15 +00:00
Renato Golin	a78995c0a0	Equally to NetBSD, Bitrig/ARM uses the Itanium-ABI. Patch by Patrick Wildt. llvm-svn: 230762	2015-02-27 16:35:27 +00:00
Zoran Jovanovic	71a33e2ad6	[mips][microMIPS] Change register class for GP register Differential Revision: http://reviews.llvm.org/D7934 llvm-svn: 230760	2015-02-27 15:03:50 +00:00
Tom Stellard	aec94b3bf3	R600/SI: Add missing mubuf instructions llvm-svn: 230759	2015-02-27 14:59:46 +00:00
Tom Stellard	49282c92c5	R600/SI: Consistently put soffset before the offset operand for mubuf instructions This matches the assembly syntax. llvm-svn: 230758	2015-02-27 14:59:44 +00:00
Tom Stellard	1f9939fba6	R600/SI: Add slc, glc, and tfe to non-atomic _ADDR64 instructions llvm-svn: 230757	2015-02-27 14:59:41 +00:00
Chandler Carruth	9ad2ffac23	[x86] Run most of the rest of the shuffle combining over non-128-bit vectors. This lets us fix the rest of the v16 lowering problems when pshufb is clearly better. We might still be able to improve some of the lowerings by enabling the other combine-based rewriting to fire for non-128-bit vectors, but this at least should remove any regressions from using the fancy v16i16 lowering strategy. llvm-svn: 230753	2015-02-27 12:13:14 +00:00
Chandler Carruth	66b705bc64	[x86] Teach a bunch of the x86-specific shuffle combining to work with 256-bit vectors as well as 128-bit vectors. Fixes some of the redundant shuffles for v16i16. llvm-svn: 230752	2015-02-27 11:45:13 +00:00
Chandler Carruth	97f3260f57	[x86] Make the v8i16 clever single-input shuffle lowering usable for repeated 128-bit lane shuffles of wider vector types and use it to lower 256-bit v16i16 vector shuffles where applicable. This should let us perfectly lower the pattern of pshuflw and pshufhw even for AVX2 256-bit patterns. I've not added AVX-512 support, but it should be trivial for someone working on that to wire up. Note that currently this generates bad, long shuffle chains because we don't combine 256-bit target shuffles. The subsequent patches will fix that. llvm-svn: 230751	2015-02-27 11:33:46 +00:00
Toma Tabacu	344c167436	[mips] Remove redundant periods from -mattr=help descriptions for MIPS. Summary: Also fixes an infringement of the 80-column limit rule. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7910 llvm-svn: 230748	2015-02-27 10:44:02 +00:00
Chandler Carruth	ddc4d085cc	[x86] Make the single-input v8i16 lowering directly recurse rather than going back through the entire vector shuffle lowering. This is an important step to being able to re-use this logic. llvm-svn: 230743	2015-02-27 09:11:38 +00:00
Vasileios Kalintiris	18581f16b4	[mips] Account for constant-zero operands in ADDE nodes. Summary: We identify the cases where the operand to an ADDE node is a constant zero. In such cases, we can avoid generating an extra ADDu instruction disguised as an identity move alias (ie. addu $r, $r, 0 --> move $r, $r). Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7906 llvm-svn: 230742	2015-02-27 09:01:39 +00:00
Anna Zaks	8ed1d8196b	[asan] Skip promotable allocas to improve performance at -O0 Currently, the ASan executables built with -O0 are unnecessarily slow. The main reason is that ASan instrumentation pass inserts redundant checks around promotable allocas. These allocas do not get instrumented under -O1 because they get converted to virtual registers by mem2reg. With this patch, ASan instrumentation pass will only instrument non promotable allocas, giving us a speedup of 39% on a collection of benchmarks with -O0. (There is no measurable speedup at -O1.) llvm-svn: 230724	2015-02-27 03:12:36 +00:00
Sanjoy Das	b818676f6d	Don't modify the DenseMap being iterated over from within the loop that is iterating over it Inserting elements into a `DenseMap` invalidated iterators pointing into the `DenseMap` instance. Differential Revision: http://reviews.llvm.org/D7924 llvm-svn: 230719	2015-02-27 02:24:16 +00:00
Charles Davis	84d28de627	Target/X86: Save Win64 non-volatile registers in a Win64 ABI function. Summary: This change causes us to actually save non-volatile registers in a Win64 ABI function that calls a System V ABI function, and vice-versa. Reviewers: rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7919 llvm-svn: 230714	2015-02-27 00:57:01 +00:00
Eric Christopher	1cdefae9c4	Rewrite MachineOperand::print and MachineInstr::print to avoid uses of TM->getSubtargetImpl and propagate to all calls. This could be a debugging regression in places where we had a TargetMachine and/or MachineFunction but don't have it as part of the MachineInstr. Fixing this would require passing a MachineFunction/Function down through the print operator, but none of the existing uses in tree seem to do this. llvm-svn: 230710	2015-02-27 00:11:34 +00:00
Rafael Espindola	4491d0d337	Put jump tables in distinct sections if -ffunction-sections is used. A small regression in r230411 was that we were basing the decision on -fdata-sections. llvm-svn: 230707	2015-02-26 23:55:11 +00:00
Zachary Turner	d270d22f35	[llvm-pdbdump] Fix dumping of function pointers and basic types. Function pointers were not correctly handled by the dumper, and they would print as "* name". They now print as "int (__cdecl *name)(int arg1, int arg2)" as they should. Also, doubles were being printed as floats. This fixes that bug as well, and adds tests for all builtin types, as well as a test for function pointers. llvm-svn: 230703	2015-02-26 23:49:23 +00:00
Eric Christopher	b9f0009b5a	Remove DebugLoc::print(LLVMContext, raw_ostream), it was just forwarding to the one that didn't take a context. llvm-svn: 230700	2015-02-26 23:32:17 +00:00
Eric Christopher	11e4df73c8	getRegForInlineAsmConstraint wants to use TargetRegisterInfo for a lookup, pass that in rather than use a naked call to getSubtargetImpl. This involved passing down and around either a TargetMachine or TargetRegisterInfo. Update all callers/definitions around the targets and SelectionDAG. llvm-svn: 230699	2015-02-26 22:38:43 +00:00
Eric Christopher	d75c00c638	Add a TargetMachine argument to the AddressingModeMatcher, we'll need this shortly to get a TargetRegisterInfo from the subtarget for TargetLowering routines. llvm-svn: 230698	2015-02-26 22:38:34 +00:00
Chandler Carruth	653773d004	[x86] Fix PR22706 where we would incorrectly try to lower a v32i8 dynamic blend as legal. We made the same mistake in two different places. Whenever we are custom lowering a v32i8 blend we need to check whether we are custom lowering it only for constant conditions that can be shuffled, or whether we actually have AVX2 and full dynamic blending support on bytes. Both are fixed, with comments added to make it clear what is going on and a new test case. llvm-svn: 230695	2015-02-26 22:15:34 +00:00
Rafael Espindola	e8fd00dab0	Simplify arange output. Move SectionMap to its only user (emitDebugARanges) and reorder to save a call to sort. llvm-svn: 230693	2015-02-26 22:02:02 +00:00
Chandler Carruth	7bd840d058	[x86] Restructure the comments and the conditions for handling dynamic blends. This makes it much more clear what is going on. The case we're handling is that of dynamic conditions, and we're bailing when the nature of the vector types and subtarget preclude lowering the dynamic condition vselect as an actual blend. No functionality changed here, but this will make a subsequent bug-fix to this code much more clear. llvm-svn: 230690	2015-02-26 21:29:06 +00:00
Chandler Carruth	efc6819041	[x86] Re-order the combines of select in the X86 backend. This doesn't change functionality, but makes it more clear that the dynamic case and the shuffle case don't overlap in any interesting way. llvm-svn: 230689	2015-02-26 21:21:36 +00:00
Chandler Carruth	0757f14c69	[x86] Add an assert to catch if we ever try to blend a v32i8 without AVX2. llvm-svn: 230688	2015-02-26 21:18:20 +00:00
Reid Kleckner	542a45435f	Silence some Win64 clang-cl warnings about unused stuff due to ifdefs llvm-svn: 230685	2015-02-26 21:08:21 +00:00
Reid Kleckner	1aecd5b8d9	Use wider type for overflow check on LLP64 platforms like Win64, found by clang-cl -Wtautological llvm-svn: 230684	2015-02-26 21:07:30 +00:00
Justin Bogner	43e51634bb	InstrProf: Simplify the construction of BinaryCoverageReader Creating BinaryCoverageReader is a strange and complicated dance where the constructor sets error codes that member functions will later read, and the object is in an invalid state if readHeader isn't immediately called after construction. Instead, make the constructor private and add a static create method to do the construction properly. This also has the benefit of removing readHeader completely and simplifying the interface of the object. llvm-svn: 230676	2015-02-26 20:06:28 +00:00
Justin Bogner	e84891a459	InstrProf: Rename ObjectFileCoverageMappingReader to BinaryCoverageReader The current name is long and confusing. A shorter one is both easier to understand and easier to work with. llvm-svn: 230675	2015-02-26 20:06:24 +00:00
Sanjoy Das	54ef895137	SCEVExpander incorrectly marks generated subtractions as nuw/nsw It is not sound to mark the increment operation as `nuw` or `nsw` based on a proof off of the add recurrence if the increment operation we emit happens to be a `sub` instruction. I could not come up with a test case for this -- the cases where SCEVExpander decides to emit a `sub` instruction is quite small, and I cannot think of a way I'd be able to get SCEV to prove that the increment does not overflow in those cases. Differential Revision: http://reviews.llvm.org/D7899 llvm-svn: 230673	2015-02-26 19:51:35 +00:00
Frederic Riss	adbb3f207f	[MC] Use the non-EH register mapping in the debug_frame section. On 32-bit x86 Darwin, the register mappings for the eh_frame and debug_frame sections are different. Thus the same CFI instructions should result in different registers in the object file. The problem isn't target specific though, but it requires that the mappings for EH register numbers be different from the standard Dwarf one. The patch looks a bit clumsy. LLVM uses the EH mapping as canonical for everything frame related. Thus we need to do a double conversion EH -> LLVM -> Non-EH, when emitting the debug_frame section. Fixes PR22363. Differential Revision: http://reviews.llvm.org/D7593 llvm-svn: 230670	2015-02-26 19:48:07 +00:00
Reid Kleckner	e81017248c	Don't sibcall between SysV and Win64 convention functions The shadow stack space expectations won't match. Fixes PR22709. llvm-svn: 230667	2015-02-26 19:43:20 +00:00
Hal Finkel	221f467185	[InstCombine/PowerPC] Convert aligned QPX load/store intrinsics into loads/stores InstCombine has long had logic to convert aligned Altivec load/store intrinsics into regular loads and stores. This mirrors that functionality for QPX vector load/store intrinsics. llvm-svn: 230660	2015-02-26 18:56:03 +00:00
Paul Robinson	093d6e1a70	When the source has a series of assignments, users reasonably want to have the debugger step through each one individually. Turn off the combine for adjacent stores at -O0 so we get this behavior. Possibly, DAGCombine shouldn't run at all at -O0, but that's for another day; see PR22346. Differential Revision: http://reviews.llvm.org/D7181 llvm-svn: 230659	2015-02-26 18:47:57 +00:00
Petar Jovanovic	90ec1b175e	Fix justify error for small structures in varargs for MIPS64BE There was a problem when passing structures as variable arguments. The structures smaller than 64 bit were not left justified on MIPS64 big endian. This is now fixed by shifting the value to make it left- justified when appropriate. This fixes the bug http://llvm.org/bugs/show_bug.cgi?id=21608 Patch by Aleksandar Beserminji. Differential Revision: http://reviews.llvm.org/D7881 llvm-svn: 230657	2015-02-26 18:35:15 +00:00
Sumanth Gundapaneni	28a3b86b06	Use ".arch_extension" ARM directive to support hwdiv on krait. In case of the "krait" CPU, the asm printer doesn't emit any ".cpu", so the feature bits are not computed. This patch lets the asm printer emit the ".cpu cortex-a9" directive for krait, and the hwdiv feature is enabled through ".arch_extension". In short, krait is treated as "cortex-a9" with hwdiv. We cannot emit ".krait" as CPU since it is not supported by GNU GAS yet. llvm-svn: 230651	2015-02-26 18:08:41 +00:00
Sumanth Gundapaneni	a9049ea368	Use ".arch_extension" ARM directive to specify the additional CPU features. This patch is in response to r223147, where the available features are computed based on the ".cpu" directive. This will work cleanly for the standard variants like cortex-a9. For custom variants which rely on standard CPU names for assembly, the additional features of a CPU should be propagated. This can be done via ".arch_extension" as long as the assembler supports it. The implementation for krait along with a unit test will be submitted in the next patch. llvm-svn: 230650	2015-02-26 18:07:35 +00:00
Adam Nemet	9cc0c3999d	[LV/LoopAccesses] Backward dependences are not safe just because the accesses are via different types Noticed this while generalizing the code for loop distribution. I confirmed with Arnold that this was indeed a bug and managed to create a testcase. llvm-svn: 230647	2015-02-26 17:58:48 +00:00
Tom Stellard	eb05c610b4	R600/SI: Remove M0 from DS assembly strings This matches the assembly syntax for the proprietary compiler. llvm-svn: 230645	2015-02-26 17:08:43 +00:00
Michael Kuperstein	4af7449659	[X86][Haswell][SchedModel] Fix WriteMULm latency. The latency for the WriteMULm class was set to 4, which is actually lower than the latency for WriteMULr (5). A better estimate would be 4 added to WriteMULr, that is, 9. llvm-svn: 230634	2015-02-26 14:30:09 +00:00
Chandler Carruth	8e0a3ea52c	[x86] Sink the single-input v8i16 lowering code that is actually formulaic into the top v8i16 lowering routine. This makes the generalized lowering a completely general and single path lowering which will allow generalizing it in turn for multiple 128-bit lanes. llvm-svn: 230623	2015-02-26 11:00:40 +00:00
Chandler Carruth	11e7f6b50a	[x86] Remove a SimpleTy usage. No need for it here, we already have the MVT. llvm-svn: 230622	2015-02-26 10:37:01 +00:00
Sanjoy Das	e91665de39	IRCE: only touch loops that have been shown to have a high backedge-taken count in profiling data. llvm-svn: 230619	2015-02-26 08:56:04 +00:00
Sanjoy Das	e75ed92630	IRCE: generalize to handle loops with decreasing induction variables. IRCE can now split the iteration space for loops like: for (i = n; i >= 0; i--) a[i + k] = 42; // bounds check on access llvm-svn: 230618	2015-02-26 08:19:31 +00:00
Chandler Carruth	d283cb6203	[x86] Make the vector shuffle helpers order the SDLoc and MVT arguments. This ordering matches that of DAG.getNode. llvm-svn: 230617	2015-02-26 08:19:24 +00:00


77887 Commits