llvm-project

Commit Graph

Author	SHA1	Message	Date
Nadav Rotem	2d9dec322e	Add support for bottom-up SLP vectorization infrastructure. This commit adds the infrastructure for performing bottom-up SLP vectorization (and other optimizations) on parallel computations. The infrastructure has three potential users: 1. The loop vectorizer needs to be able to vectorize AOS data structures such as (sum += A[i] + A[i+1]). 2. The BB-vectorizer needs this infrastructure for bottom-up SLP vectorization, because bottom-up vectorization is faster to compute. 3. A loop-roller needs to be able to analyze consecutive chains and roll them into a loop, in order to reduce code size. A loop roller does not need to create vector instructions, and this infrastructure separates the chain analysis from the vectorization. This patch also includes a simple (100 LOC) bottom-up SLP vectorizer that uses the infrastructure, and can vectorize this code: void SAXPY(int *x, int *y, int a, int i) { x[i] = a * x[i] + y[i]; x[i+1] = a * x[i+1] + y[i+1]; x[i+2] = a * x[i+2] + y[i+2]; x[i+3] = a * x[i+3] + y[i+3]; } llvm-svn: 179117	2013-04-09 19:44:35 +00:00
Eric Christopher	caeddf5a96	Make check depend on all. llvm-svn: 179116	2013-04-09 19:42:12 +00:00
Chad Rosier	a08f30f093	[ms-inline asm] Use parsePrimaryExpr in lieu of parseExpression if we need to parse an identifier. Otherwise, parseExpression may parse multiple tokens, which makes it impossible to properly compute an immediate displacement. An example of such a case is the source operand (i.e., [Symbol + ImmDisp]) in the below example: __asm mov eax, [Symbol + ImmDisp] The existing test cases exercise this patch. rdar://13611297 llvm-svn: 179115	2013-04-09 19:34:59 +00:00
Eric Christopher	52ce7189c1	The .dwo section shouldn't contain the unrelocated values (and therefore not at all) of the pc or statement list. We also don't need to emit the compilation dir, so save some space and time and don't bother. Fix up the testcase accordingly and verify that we don't emit the attributes or the items that they use. llvm-svn: 179114	2013-04-09 19:23:15 +00:00
Hal Finkel	21aad9a8e8	Cleanup PPCEarlyReturn Some general cleanup and only scan the end of a BB for branches (once we're done with the terminators and debug values, then there should not be any other branches). These address post-commit review suggestions by Bill Schmidt. No functionality change intended. llvm-svn: 179112	2013-04-09 18:25:18 +00:00
Nadav Rotem	abcc64fd13	Revert r176408 and r176407 to address PR15540. llvm-svn: 179111	2013-04-09 18:16:05 +00:00
Chad Rosier	e81309b3bf	[ms-inline asm] Maintain a StringRef to reference a symbol in a parsed operand, rather than deriving the StringRef from the Start and End SMLocs. Using the Start and End SMLocs works fine for operands such as [Symbol], but not for operands such as [Symbol + ImmDisp]. All existing test cases that reference a variable exercise this patch. rdar://13602265 llvm-svn: 179109	2013-04-09 17:53:49 +00:00
Benjamin Kramer	bbae991db6	DAGCombiner: Fold a shuffle on CONCAT_VECTORS into a new CONCAT_VECTORS if possible. This pattern occurs in SROA output due to the way vector arguments are lowered on ARM. The testcase from PR15525 now compiles into this, which is better than the code we got with the old scalarrepl: _Store: ldr.w r9, [sp] vmov d17, r3, r9 vmov d16, r1, r2 vst1.8 {d16, d17}, [r0] bx lr Differential Revision: http://llvm-reviews.chandlerc.com/D647 llvm-svn: 179106	2013-04-09 17:41:43 +00:00
Hal Finkel	b5899d5774	Use virtual base registers on PPC On PowerPC, non-vector loads and stores have r+i forms; however, in functions with large stack frames these were not being used to access slots far from the stack pointer because such slots were out of range for the signed 16-bit immediate offset field. This increases register pressure because we need a separate register for each offset (when the r+r form is used). By enabling virtual base registers, we can deal with large stack frames without unduly increasing register pressure. llvm-svn: 179105	2013-04-09 17:27:09 +00:00
Hal Finkel	059825b0f8	Convert test PowerPC/2007-09-07-LoadStoreIdxForms to FileCheck llvm-svn: 179104	2013-04-09 17:26:55 +00:00
Eli Bendersky	1cc814a8e6	Rewrite test/Linker tests to use FileCheck instead of grep. Some translations here are not 1x1 because there are grep | grep chains that are non-trivial to implement in terms of FileCheck features. I made an effort for the tests to remain as similar as possible; do let me know if you notice anything fishy. The good news is that some buggy tests were fixed (grep | not grep - a bug waiting to happen). llvm-svn: 179102	2013-04-09 16:51:13 +00:00
Rafael Espindola	c2413f59e4	Convert MachOObjectFile to a template. For now it is templated only on being 64 or 32 bits. I will add little/big endian next. llvm-svn: 179097	2013-04-09 14:49:08 +00:00
Alexey Samsonov	d60859b21e	DWARF parser: Fix DWARF-2/3 incompatibility: size of DW_FORM_ref_addr is the same as DW_FORM_addr in DWARF2, and is 4/8 bytes on 32/64-bit DWARF starting from DWARF3. Adding a test for this is a huge pain - generating and uploading pre-built binary with DWARF3 debug info is way too ugly, and writing fine-grained unittests for DebugInfo is impossible, as it doesn't expose any headers in include/llvm. That said, I'm going to choose the second approach and submit the patch exposing DebugInfo headers for review soon enough. llvm-svn: 179095	2013-04-09 14:09:42 +00:00
Michael Gottesman	ccc93e72e1	Converted 8x tests of SimplifyCFG to use FileCheck instead of grep. llvm-svn: 179087	2013-04-09 05:18:53 +00:00
Jakob Stoklund Olesen	c910feb4a8	Extract a function. llvm-svn: 179086	2013-04-09 05:11:52 +00:00
Nadav Rotem	757aec9507	Remove the confusing sentence. llvm-svn: 179085	2013-04-09 04:48:40 +00:00
Nadav Rotem	7b7585d153	Revert 179071 because it is not the right way to support non standard new/new[] operators. llvm-svn: 179084	2013-04-09 04:43:46 +00:00
Jakob Stoklund Olesen	2cfe46fd34	Compute correct frame sizes for SPARC v9 64-bit frames. The save area is twice as big and there is no struct return slot. The stack pointer is always 16-byte aligned (after adding the bias). Also eliminate the stack adjustment instructions around calls when the function has a reserved stack frame. llvm-svn: 179083	2013-04-09 04:37:47 +00:00
Rafael Espindola	eb8b211e61	More uses for SymbolTableEntryBase. llvm-svn: 179076	2013-04-09 01:04:06 +00:00
Rafael Espindola	5d6cec9bff	Add a SymbolTableEntryBase. Use it when we don't need to know if we have a 32 or 64 bit SymbolTableEntry. llvm-svn: 179074	2013-04-09 00:22:58 +00:00
Joe Groff	6cdbe3f6df	Fix PointerIntPair to be enum class compatible. Some parts of PointerIntPair assumed that the IntType of the pair was implicitly convertible to intptr_t, which is not the case for enum class values. Add a static_cast<intptr_t> to make these conversions explicit and allow PointerIntPair to be used with an enum class IntType. While we're here, rename some of the argument values so we don't have variables named "Int" floating around. llvm-svn: 179073	2013-04-09 00:01:51 +00:00
Rafael Espindola	65d601f96c	Add a SectionBase struct. Use it to share code and when we don't need to know if we have a 32 or 64 bit Section. llvm-svn: 179072	2013-04-08 23:57:13 +00:00
Nadav Rotem	9dd90ac5b4	C++ new operators are not malloc-like functions because they do not return uninitialized memory. Users may override new operators and implement any function that they like. llvm-svn: 179071	2013-04-08 23:40:47 +00:00
NAKAMURA Takumi	065fd35268	InstructionSimplify.cpp: Fix a ligature, "fi", to get rid of utf8 in comment. llvm-svn: 179066	2013-04-08 23:05:21 +00:00
Shuxin Yang	331f01dcb4	Redo the fix Benjamin Kramer committed in r178793 about iterator invalidation in Reassociate. I brazenly think this change is slightly simpler than r178793 because: - no "state" in functor - "OpndPtrs[i]" looks simpler than "&Opnds[OpndIndices[i]]" While I can reproduce the problem in Valgrind, it is rather difficult to come up with a standalone test case. The reason is that when an iterator is invalidated, the stale invalidated elements are not yet clobbered by nonsense data, so the optimizer can still proceed successfully. Thanks to Benjamin for fixing this bug and generously providing the test case. llvm-svn: 179062	2013-04-08 22:00:43 +00:00
Nadav Rotem	fe47d58cf0	Update the docs about the fact that the loop vectorizer is enabled by default for -O3. llvm-svn: 179060	2013-04-08 21:34:49 +00:00
Rafael Espindola	c0406e162c	Template the MachO types over the word size. llvm-svn: 179051	2013-04-08 20:45:01 +00:00
Rafael Espindola	29d4501774	Remove is64BitLoadCommand. llvm-svn: 179048	2013-04-08 20:18:53 +00:00
Eli Bendersky	19654c01c1	Rewrite test/Integer tests to use FileCheck instead of grep llvm-svn: 179047	2013-04-08 20:18:15 +00:00
Eli Bendersky	ed61b06fa8	Rewrite test/ExecutionEngine tests to use FileCheck instead of grep llvm-svn: 179043	2013-04-08 19:51:36 +00:00
Matt Arsenault	38b9e136ec	Update documentation. First feature is not CPU subtype anymore since r134127 llvm-svn: 179038	2013-04-08 18:52:58 +00:00
Eli Bendersky	aa3ffafbde	Rewrite test/Verifier tests to use FileCheck instead of grep llvm-svn: 179036	2013-04-08 18:33:51 +00:00
Arnold Schwaighofer	f47d2d7f6b	X86 cost model: Model cost for uitofp and sitofp on SSE2 The costs are overfitted so that I can still use the legalization factor. For example the following kernel has about half the throughput vectorized than unvectorized when compiled with SSE2. Before this patch we would vectorize it. unsigned short A[1024]; double B[1024]; void f() { int i; for (i = 0; i < 1024; ++i) { B[i] = (double) A[i]; } } radar://13599001 llvm-svn: 179033	2013-04-08 18:05:48 +00:00
Chad Rosier	fce4fab1a4	[ms-inline asm] Add support for ImmDisp [ Symbol ] memory operands. rdar://13521249 llvm-svn: 179030	2013-04-08 17:43:47 +00:00
Hal Finkel	b5aa7e54d9	Generate PPC early conditional returns PowerPC has a conditional branch to the link register (return) instruction: BCLR. This should be used any time when we'd otherwise have a conditional branch to a return. This adds a small pass, PPCEarlyReturn, which runs just prior to the branch selection pass (and, importantly, after block placement) to generate these conditional returns when possible. It will also eliminate unconditional branches to returns (these happen rarely; most of the time these have already been tail duplicated by the time PPCEarlyReturn is invoked). This is a nice optimization for small functions that do not maintain a stack frame. llvm-svn: 179026	2013-04-08 16:24:03 +00:00
Alexey Samsonov	c03f2ee0ae	DWARF parser: remove duplicated code and fix code style in DIE extractors. llvm-svn: 179023	2013-04-08 14:37:16 +00:00
Rafael Espindola	d66c414619	Add all 4 MachO object types. Use the stored type to implement is64Bits(). llvm-svn: 179021	2013-04-08 13:25:33 +00:00
Vincent Lejeune	5f11dd390a	R600: Control Flow support for pre EG gen llvm-svn: 179020	2013-04-08 13:05:49 +00:00
Chandler Carruth	a6d5e3e9a2	Simplify the quoting here. Our lit emulator doesn't deal well with the nested quoting schemes, and they're not important here... llvm-svn: 179014	2013-04-08 10:07:50 +00:00
Chandler Carruth	3fa99abfae	Remove a global 'endl' variable from the other file as well. llvm-svn: 179010	2013-04-08 08:55:18 +00:00
Chandler Carruth	1819289607	Clean up namespaces in obj2yaml.cpp. llvm-svn: 179009	2013-04-08 08:55:14 +00:00
Tim Northover	85c19f5a73	Add ACLE link to ARM documentation sections llvm-svn: 179006	2013-04-08 08:42:24 +00:00
Tim Northover	15410e98d3	AArch64: remove barriers from AArch64 atomic operations. I've managed to convince myself that AArch64's acquire/release instructions are sufficient to guarantee C++11's required semantics, even in the sequentially-consistent case. llvm-svn: 179005	2013-04-08 08:40:41 +00:00
Chandler Carruth	c224e25d49	Cleanup the formatting of obj2yaml.cpp. I couldn't touch this file and not clean it up some. These reformattings brought to you by clang-format, with some minor adjustments by me. More spring cleaning to follow here. llvm-svn: 179004	2013-04-08 08:39:59 +00:00
Chandler Carruth	741c00df17	Don't define our own global 'endl' variable. While technically it had internal linkage and so wasn't a patent bug, it doesn't make any sense here. We can avoid even calling operator<< by just embedding the newline in the string literals that were already being streamed out. It also gives the impression of some line-ending agnosticism that is not present, and that flushing happens when it doesn't. If we want to use std::endl, we could do that, but honestly it doesn't seem remotely worth it. Using '\n' directly is much clearer when working with raw_ostream. It also happens to fix builds with old crufty GCC STL implementations that include std::endl into the global namespace (or headers written to be compatible with such atrocities). llvm-svn: 179003	2013-04-08 08:30:47 +00:00
Benjamin Kramer	d56a324e30	ARM: Remove unused variable. llvm-svn: 179001	2013-04-08 08:07:35 +00:00
Hal Finkel	81f8799fe3	Cleanup and improve PPC fsel generation First, we should not cheat: fsel-based lowering of select_cc is a finite-math-only optimization (the ISA manual, section F.3 of v2.06, makes this clear, as does a note in our own README). This also adds fsel-based lowering of EQ and NE condition codes. As it turned out, fsel generation was covered by a grand total of zero regression test cases. I've added some test cases to cover the existing behavior (which is now finite-math only), as well as the new EQ cases. llvm-svn: 179000	2013-04-07 22:11:09 +00:00
Arnold Schwaighofer	995ce6c388	TargetLowering: Fix getTypeConversion handling of extended vector types The code in getTypeConversion attempts to promote the vector element type before it tries to split or widen the vector. After it failed to find a legal vector type by promoting, it would continue using the promoted vector element type, thereby missing legal split vector types. For example the type v32i32 that has a legal split of 8 x v4i32 on x86/sse2 would be transformed to: v32i256 and from there on successively split to: v16i256, v8i256, v1i256 and then finally ends up as an i64 type. By resetting the vector element type to the original vector element type that existed before the promotion the code will attempt to split the vector type to smaller vector widths of the same type. llvm-svn: 178999	2013-04-07 20:22:56 +00:00
Rafael Espindola	421305aff8	Make MachOObjectFile independent from MachOObject. llvm-svn: 178998	2013-04-07 20:01:29 +00:00
Rafael Espindola	c1f28b6a8e	Implement MachOObjectFile::getData directly. llvm-svn: 178997	2013-04-07 19:42:15 +00:00


90939 Commits