llvm-project

Commit Graph

Author	SHA1	Message	Date
Owen Anderson	f8f259df48	Fix a bug in the LLParser where we failed to diagnose landingpads with non-constant clause operands. Fixing this also exposed a related issue where the landingpad under construction was not cleaned up when an error was raised, which would cause bad reference errors before the error could actually be printed. llvm-svn: 231634	2015-03-09 07:13:42 +00:00
Kevin Qin	aef68418de	[AArch64] Enable partial & runtime unrolling on cortex-a57 The inner loop of a loop nest is more likely to be a hot loop, and the runtime check can be promoted out by patch 0001, so the overhead is lower and we can try a doubled threshold to unroll more loops. llvm-svn: 231632	2015-03-09 06:14:28 +00:00
Kevin Qin	715b01e979	Introduce runtime unrolling disable metadata and use it to mark the scalar loop from vectorization. Runtime unrolling is an expensive optimization which can bring benefit only if the loop is hot and the iteration count is relatively large. For some loops, we know they are not worth runtime unrolling. The scalar loop from vectorization is one of these cases. llvm-svn: 231631	2015-03-09 06:14:18 +00:00
Kevin Qin	a998735def	Run LICM pass after loop unrolling pass. Runtime unrolling will introduce a runtime check in the loop prologue. If the unrolled loop is an inner loop, then the prologue will be inside the outer loop. The LICM pass can help to promote the runtime check out if the checked value is loop invariant. llvm-svn: 231630	2015-03-09 06:14:07 +00:00
Mehdi Amini	eb242a5041	InstCombine: fix fold "fcmp x, undef" to account for NaN Summary: See the two test cases. ; Can fold fcmp with undef on one side by choosing NaN for the undef ; Can fold fcmp with undef on both side ; fcmp u_pred undef, undef -> true ; fcmp o_pred undef, undef -> false ; because whatever you choose for the first undef ; you can choose NaN for the other undef Reviewers: hfinkel, chandlerc, majnemer Reviewed By: majnemer Subscribers: majnemer, llvm-commits Differential Revision: http://reviews.llvm.org/D7617 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 231626	2015-03-09 03:20:25 +00:00
Mehdi Amini	75eda5e913	DCE: isArrayMalloc() is not used in either LLVM or Clang From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 231624	2015-03-09 02:57:32 +00:00
David Blaikie	dc3f01e9cf	Simplify expressions involving boolean constants with clang-tidy Patch by Richard (legalize at xmission dot com). Differential Revision: http://reviews.llvm.org/D8154 llvm-svn: 231617	2015-03-09 01:57:13 +00:00
Owen Anderson	7e621e9d5e	Teach DataLayout to infer a plausible alignment for things even when nothing is specified by the user. llvm-svn: 231613	2015-03-08 21:53:59 +00:00
Andrea Di Biagio	6c7d70469c	[X86][AVX] Fix wrong lowering of VPERM2X128 nodes There were cases where the backend computed a wrong permute mask for a VPERM2X128 node. Example: \code define <8 x float> @foo(<8 x float> %a, <8 x float> %b) { %shuffle = shufflevector <8 x float> %a, <8 x float> %b, <8 x i32> <i32 undef, i32 undef, i32 6, i32 7, i32 undef, i32 undef, i32 6, i32 7> ret <8 x float> %shuffle } \code end Before this patch, llc (with -mattr=+avx) emitted the following vperm2f128: vperm2f128 $0, %ymm0, %ymm0, %ymm0 # ymm0 = ymm0[0,1,0,1] With this patch, llc emits a vperm2f128 with a correct permute mask: vperm2f128 $17, %ymm0, %ymm0, %ymm0 # ymm0 = ymm0[2,3,2,3] Differential Revision: http://reviews.llvm.org/D8119 llvm-svn: 231601	2015-03-08 16:28:47 +00:00
Benjamin Kramer	57a3d084cd	Make static variables const if possible. Makes them go into a read-only section. Or fold them into a initializer list which has the same effect. NFC. llvm-svn: 231598	2015-03-08 16:07:39 +00:00
Simon Pilgrim	8c58c066b7	[DAGCombiner] Add a shuffle mask commutation helper function. NFCI. We have an increasing number of cases where we are creating commuted shuffle masks - all implementing nearly the same code. This patch adds a static helper function - ShuffleVectorSDNode::commuteMask() and replaces a number of cases to use it. Differential Revision: http://reviews.llvm.org/D8139 llvm-svn: 231581	2015-03-07 22:33:11 +00:00
David Majnemer	73460f94a2	Fix the autoconf build lib/ExecutionEngine/Targets has no Makefile, causing the autoconf build to fail. Solve this by bringing the COFF implementation of RuntimeDyld in line like the Mach-O and ELF implementations. llvm-svn: 231579	2015-03-07 21:47:46 +00:00
Benjamin Kramer	f027ad7883	Make the assertion macros in Verifier and Linter truly variadic. NFC. llvm-svn: 231577	2015-03-07 21:15:40 +00:00
David Majnemer	b654b55619	Fix unused variable/function warnings llvm-svn: 231576	2015-03-07 20:56:50 +00:00
David Majnemer	1a666e0f69	ExecutionEngine: Preliminary support for dynamically loadable coff objects Provide basic support for dynamically loadable coff objects. Only handles a subset of x64 currently. Patch by Andy Ayers! Differential Revision: http://reviews.llvm.org/D7793 llvm-svn: 231574	2015-03-07 20:21:27 +00:00
Benjamin Kramer	867bfc53ee	Make constant arrays that are passed to functions as const. In theory this allows the compiler to skip materializing the array on the stack. In practice clang often fails to do that, but that's a different story. NFC. llvm-svn: 231571	2015-03-07 17:41:00 +00:00
Simon Pilgrim	2dcbe74dfd	Use SDValue bool check to tidy up some possible combines. NFC. llvm-svn: 231569	2015-03-07 16:34:55 +00:00
Benjamin Kramer	4e0c7928e2	X86: Roll repetitive code into a loop. NFC. llvm-svn: 231565	2015-03-07 15:06:16 +00:00
Andrea Di Biagio	c9d79e8103	[DAGCombiner] Fix wrong folding of AND dag nodes. This patch fixes the logic in the DAGCombiner that folds an AND node according to rule: (and (X (load V)), C) -> (X (load V)) An AND between a vector load 'X' and a constant build_vector 'C' can be folded into the load itself only if we can prove that the AND operation is redundant. The algorithm implemented by 'visitAND' firstly computes the splat value 'S' from C, and then checks if S has the lower 'B' bits set (where B is the size in bits of the vector element type). The algorithm takes into account also the 'undef' bits in the splat mask. Unfortunately, the algorithm only worked under the assumption that the size of S is a multiple of the vector element type. With this patch, we conservatively avoid folding the AND if the splat bits are not compatible with the vector element type. Added X86 test and-load-fold.ll Differential Revision: http://reviews.llvm.org/D8085 llvm-svn: 231563	2015-03-07 12:24:55 +00:00
Chandler Carruth	df397c520d	[PM] Fixup for r231556 where I missed a dependency on intrinsics generation. llvm-svn: 231558	2015-03-07 09:08:20 +00:00
Chandler Carruth	1ff7724da5	[PM] Create a separate library for high-level pass management code. This will provide the analogous replacements for the PassManagerBuilder and other code long term. This code is extracted from the opt tool currently, and I plan to extend it as I build up support for using the new pass manager in Clang and other places. Mailing this out for review in part to let folks comment on the terrible names here. A brief word about why I chose the names I did. The library is called "Passes" to try and make it clear that it is a high-level utility and where all of the passes come together and are registered in a common library. I didn't want it to be limited to a registry though, the registry is just one component. The class is a "PassBuilder" but this name I'm less happy with. It doesn't build passes in any traditional sense and isn't a Builder-style API at all. The class is a PassRegisterer or PassAdder, but neither of those really make a lot of sense. This class is responsible for constructing passes for registry in an analysis manager or for population of a pass pipeline. If anyone has a better name, I would love to hear it. The other candidate I looked at was PassRegistrar, but that doesn't really fit either. There is no register of all the passes in use, and so I think continuing the "registry" analog outside of the registry of pass names and types is a mistake. The objects themselves are just objects with the new pass manager. Differential Revision: http://reviews.llvm.org/D8054 llvm-svn: 231556	2015-03-07 09:02:36 +00:00
Simon Pilgrim	bede80a440	[DAGCombiner] SCALAR_TO_VECTOR(EXTRACT_VECTOR_ELT(V,C)) -> VECTOR_SHUFFLE This patch attempts to convert a SCALAR_TO_VECTOR using an operand from an EXTRACT_VECTOR_ELT into a VECTOR_SHUFFLE. This prevents many cases of spilling scalar data between the gpr + simd registers. At present the optimization only accepts cases where there is no TRUNC of the scalar type (i.e. all types must match). Differential Revision: http://reviews.llvm.org/D8132 llvm-svn: 231554	2015-03-07 05:52:42 +00:00
Eric Christopher	25dbdeb4d1	Typo. llvm-svn: 231547	2015-03-07 01:39:09 +00:00
Eric Christopher	7e70aba1a8	Recommit r231324 with a fix to the ARM execution domain code to disable lane switching if we don't actually have the instruction set we want to switch to. Models the earlier check above the conditional for the pass. The testcase is one that triggered with the assert that's added as part of the fix, use it to avoid adding a new testcase as it highlights the same problem. llvm-svn: 231539	2015-03-07 00:12:22 +00:00
Olivier Sallenave	049d803ce0	Do not restrict interleaved unrolling to small loops, depending on the target. llvm-svn: 231528	2015-03-06 23:12:04 +00:00
Quentin Colombet	66b616351c	[AArch64][LoadStoreOptimizer] Generate LDP + SXTW instead of LD[U]R + LD[U]RSW. Teach the load store optimizer how to sign extend a result of a load pair when it helps create more pairs. The rationale is that loads are more expensive than sign extensions, so if we gather some in one instruction this is better! <rdar://problem/20072968> llvm-svn: 231527	2015-03-06 22:42:10 +00:00
Matthias Braun	898d11e864	DAGCombiner: Canonicalize select(and/or,x,y) depending on target. This is based on the following equivalences: select(C0 & C1, X, Y) <=> select(C0, select(C1, X, Y), Y) select(C0 | C1, X, Y) <=> select(C0, X, select(C1, X, Y)) Many targets cannot perform and/or on the CPU flags and therefore the right side should be chosen to avoid materializing the i1 flags in an integer register. If the target can perform this operation efficiently we normalize to the left form. Differential Revision: http://reviews.llvm.org/D7622 llvm-svn: 231507	2015-03-06 19:49:10 +00:00
Matthias Braun	3ecb557739	DAGCombiner: Factor out some and/or combines. This is in preparation for changing visitSELECT to normalize towards select(Cond0, select(Cond1, X, Y), Y); select(Cond0, X, select(Cond1, X, Y)) which perform an implicit and/or of the conditions. The factored function contains all DAGCombine rules which reduce two values combined by an And/Or operation to a single value. This does not include rules involving constants as visitSELECT already handles that case. Differential Revision: http://reviews.llvm.org/D8026 llvm-svn: 231506	2015-03-06 19:49:06 +00:00
Benjamin Kramer	e8a64a20f2	LoopInterchange: Remove empty method. llvm-svn: 231503	2015-03-06 19:37:26 +00:00
Benjamin Kramer	79442920bf	LoopInterchange: Rephrase instruction moving using ilist's splice and factor it into a function + Random cleanups. No functional change. llvm-svn: 231501	2015-03-06 18:59:14 +00:00
Matthias Braun	046318b87e	ExecutionDepsFix: Indizes -> Indices. Translate german to english. llvm-svn: 231500	2015-03-06 18:56:20 +00:00
Eric Christopher	6a8bfe7198	Fix typo. llvm-svn: 231495	2015-03-06 18:20:23 +00:00
Tom Stellard	6b42f2d8aa	R600/SI: Remove unused register class llvm-svn: 231491	2015-03-06 17:00:16 +00:00
Benjamin Kramer	298a3a0567	Fold init() helpers into constructors. NFC. llvm-svn: 231486	2015-03-06 16:21:15 +00:00
Chad Rosier	99b3e022c4	Avoid calls to dumpPassInfo and RegionBase<Tr>::getNameStr() in RGPassManager if -debug-pass is not specified, as the string is only used when dumping pass information. There is a big cost of determining the name in RegionBase<Tr>::getNameStr() if the region's entry or exit block doesn't have a name. This is the case for the Release build, as names are not preserved by the front-end. RegionPass is mainly used by Polly, resulting in long compile time for one file of a customer application with the Release build (1m24s) vs Release+Asserts build (10s) when Polly is used. With this change, the compile time with the Release build went down to 8s. Patch by Sanjin Sijaric <ssijaric@codeaurora.org>! Phabricator: http://reviews.llvm.org/D8076 llvm-svn: 231485	2015-03-06 16:15:04 +00:00
James Molloy	dcc78ec386	[ConstantRange] Teach multiply to be cleverer about signed ranges. Multiplication is not dependent on signedness, so just treating all input ranges as unsigned is not incorrect. However it will cause overly pessimistic ranges (such as full-set) when used with signed negative values. Teach multiply to try to interpret its inputs as both signed and unsigned, and then to take the most specific (smallest population) as its result. llvm-svn: 231483	2015-03-06 15:50:47 +00:00
Bruno Cardoso Lopes	618c67a018	[AsmPrinter][TLOF] 32-bit MachO support for replacing GOT equivalents Add MachO 32-bit (i.e. arm and x86) support for replacing global GOT equivalent symbol accesses. Unlike 64-bit targets, there's no GOTPCREL relocation, and access through a non_lazy_symbol_pointers section is used instead. -- before _extgotequiv: .long _extfoo _delta: .long _extgotequiv-_delta -- after _delta: .long L_extfoo$non_lazy_ptr-_delta .section __IMPORT,__pointers,non_lazy_symbol_pointers L_extfoo$non_lazy_ptr: .indirect_symbol _extfoo .long 0 llvm-svn: 231475	2015-03-06 13:49:05 +00:00
Bruno Cardoso Lopes	52b1391df6	[AsmPrinter][TLOF] ARM64 MachO support for replacing GOT equivalents Follow up r230264 and add ARM64 support for replacing global GOT equivalent symbol accesses by references to the GOT entry for the final symbol instead, example: -- before .globl _foo _foo: .long 42 .globl _gotequivalent _gotequivalent: .quad _foo .globl _delta _delta: .long _gotequivalent-_delta -- after .globl _foo _foo: .long 42 .globl _delta Ltmp3: .long _foo@GOT-Ltmp3 llvm-svn: 231474	2015-03-06 13:48:45 +00:00
Toma Tabacu	4e0cf8e211	[mips] [IAS] Add missing constraints and improve testing for the .module directive. Summary: None of the .set directives can be used before the .module directives. The .set mips0/pop/push were not triggering this constraint. Also added testing for all the other implemented directives which are supposed to trigger this constraint. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7140 llvm-svn: 231465	2015-03-06 12:15:12 +00:00
Daniel Jasper	6adbd7aecf	Change the way in which error case is being handled. Specifically this: * Prevents an "unused" warning in non-assert builds. * In that error case return without removing a child loop instead of looping forever. llvm-svn: 231459	2015-03-06 10:39:14 +00:00
Karthik Bhat	88db86dd29	Add a new pass "Loop Interchange" This pass interchanges loops to provide a more cache-friendly memory access. For example, given a loop like - for(int i=0;i<N;i++) for(int j=0;j<N;j++) A[j][i] = A[j][i]+B[j][i]; is interchanged to - for(int j=0;j<N;j++) for(int i=0;i<N;i++) A[j][i] = A[j][i]+B[j][i]; This pass is currently disabled by default. To give a brief introduction it consists of 3 stages- LoopInterchangeLegality : Checks the legality of loop interchange based on Dependency matrix. LoopInterchangeProfitability: A very basic heuristic has been added to check for profitability. This will evolve over time. LoopInterchangeTransform : Which does the actual transform. LNT Performance tests show improvement in Polybench/linear-algebra/kernels/mvt and Polybench/linear-algebra/kernels/gemver benchmarks. TODO: 1) Add support for reductions and lcssa phi. 2) Improve profitability model. 3) Improve loop selection algorithm to select best loop for interchange. Currently the innermost loop is selected for interchange. 4) Improve compile time regression found in llvm lnt due to this pass. 5) Fix issues in Dependency Analysis module. A special thanks to Hal for reviewing this code. Review: http://reviews.llvm.org/D7499 llvm-svn: 231458	2015-03-06 10:11:25 +00:00
David Majnemer	b61f4e403d	X86: Form IMGREL relocations for LLVM Functions We supported forming IMGREL relocations from ConstantExprs involving __ImageBase if the minuend was a GlobalVariable. Extend this functionality to all GlobalObjects. llvm-svn: 231456	2015-03-06 08:11:32 +00:00
Yaron Keren	322bdad085	Silence C4715 'not all control paths return a value' warnings. llvm-svn: 231455	2015-03-06 07:49:14 +00:00
Rui Ueyama	da9bc2e56d	Support: Improve performance of FileOutputBuffer on Windows We extend an underlying file before mmap'ing it, but it's not needed on Windows. Extending a file is slow on Windows, so we should avoid doing that. The difference gets larger as the size of an output file gets larger. It shaves 2 seconds off 25 seconds when linking chrome.dll with LLD, for example. llvm-svn: 231452	2015-03-06 06:07:32 +00:00
Michael Gottesman	6ff10c959a	[objc-arc] Sprinkle some more auto on some iterators. llvm-svn: 231447	2015-03-06 02:10:03 +00:00
Michael Gottesman	16e6a2057f	[objc-arc] Move the detection of potential uses or altering of a ref count onto PtrState. llvm-svn: 231446	2015-03-06 02:07:12 +00:00
Michael Zolotukhin	03dd1082ad	LegalizeTypes: Handle shift by 0 in ExpandShiftByConstant. Though such shifts are usually optimized away by combiner, we still can encounter them after a vector shift is legalized. llvm-svn: 231443	2015-03-06 01:13:01 +00:00
Rafael Espindola	a5b9e1cf39	Remember to move a type to the correct set when setting the body. We would set the body of a struct type (therefore making it non-opaque) but were forgetting to move it to the non-opaque set. Fixes pr22807. llvm-svn: 231442	2015-03-06 00:50:21 +00:00
Michael Gottesman	6080596328	[objc-arc] Move the checking of whether or not we can match onto PtrStates and out of the main dataflow. These refactored computations check whether or not we are at a stage of the sequence where we can perform a match. This patch moves the computation out of the main dataflow and into {BottomUp,TopDown}PtrState. llvm-svn: 231439	2015-03-06 00:34:42 +00:00
Michael Gottesman	4eae396ae9	[objc-arc] Refactor (Re-)initialization of PtrState from dataflow -> {TopDown,BottomUp}PtrState Class. This initialization occurs when we see a new retain or release. Before we performed the actual initialization inline in the dataflow. That is just messy. llvm-svn: 231438	2015-03-06 00:34:39 +00:00
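
The two select equivalences quoted in commit 898d11e864 above can be checked by brute force. What follows is only a minimal sketch in Python, not LLVM code: `select`, `and_forms_agree`, `or_forms_agree` and `check_all` are hypothetical helpers introduced here for illustration, and conditions are modelled as plain booleans rather than i1 values.

```python
from itertools import product


def select(cond, x, y):
    """Model a select node: yield x when cond is true, otherwise y."""
    return x if cond else y


def and_forms_agree(c0, c1, x, y):
    # select(C0 & C1, X, Y) <=> select(C0, select(C1, X, Y), Y)
    return select(c0 and c1, x, y) == select(c0, select(c1, x, y), y)


def or_forms_agree(c0, c1, x, y):
    # select(C0 | C1, X, Y) <=> select(C0, X, select(C1, X, Y))
    return select(c0 or c1, x, y) == select(c0, x, select(c1, x, y))


def check_all():
    """Try every combination of the two conditions with distinct X and Y."""
    x, y = "X", "Y"  # distinct placeholder values, so a wrong pick is visible
    return all(
        and_forms_agree(c0, c1, x, y) and or_forms_agree(c0, c1, x, y)
        for c0, c1 in product([False, True], repeat=2)
    )


if __name__ == "__main__":
    print("equivalences hold:", check_all())
```

The nested forms on the right never combine C0 and C1 into a single value, which is the property the commit message relies on for targets that cannot perform and/or on CPU flags.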


77660 Commits