llvm-project

Commit Graph

Author	SHA1	Message	Date
Arnold Schwaighofer	3be40b56c5	Loop Vectorizer: Refactor code to compute vectorized memory instruction cost Introduce a helper class that computes the cost of memory access instructions. No functionality change intended. llvm-svn: 174422	2013-02-05 18:46:41 +00:00
Arnold Schwaighofer	22174f5d5a	Loop Vectorizer: Handle pointer stores/loads in getWidestType() In the loop vectorizer cost model, we used to ignore stores/loads of a pointer type when computing the widest type within a loop. This meant that if we had only stores/loads of pointers in a loop we would return a widest type of 8bits (instead of 32 or 64 bit) and therefore a vector factor that was too big. Now, if we see a consecutive store/load of pointers we use the size of a pointer (from data layout). This problem occured in SingleSource/Benchmarks/Shootout-C++/hash.cpp (reduced test case is the first test in vector_ptr_load_store.ll). radar://13139343 llvm-svn: 174377	2013-02-05 15:08:02 +00:00
Pekka Jaaskelainen	f50ab84bb1	LoopVectorize: convert TinyTripCountVectorThreshold constant to a command line switch. llvm-svn: 173837	2013-01-29 21:42:08 +00:00
Benjamin Kramer	cf406756ce	LoopVectorize: Clean up ValueMap a bit and avoid double lookups. No intended functionality change. llvm-svn: 173809	2013-01-29 17:31:33 +00:00
Renato Golin	1258519674	Vectorization Factor clarification llvm-svn: 173691	2013-01-28 16:02:45 +00:00
Hal Finkel	293a41d14f	BBVectorize: Better use of TTI->getShuffleCost When flipping the pair of subvectors that form a vector, if the vector length is 2, we can use the SK_Reverse shuffle kind to get more-accurate cost information. Also we can use the SK_ExtractSubvector shuffle kind to get accurate subvector extraction costs. The current cost model implementations don't yet seem complex enough for this to make a difference (thus, there are no test cases with this commit), but it should help in future. Depending on how the various targets optimize and combine shuffles in practice, we might be able to get more-accurate costs by combining the costs of multiple shuffle kinds. For example, the cost of flipping the subvector pairs could be modeled as two extractions and two subvector insertions. These changes, however, should probably be motivated by specific test cases. llvm-svn: 173621	2013-01-27 20:07:01 +00:00
Hal Finkel	2d443e94b4	BBVectorize: Add a additional comment about the cost computation llvm-svn: 173580	2013-01-26 16:49:04 +00:00
Hal Finkel	351a75b6d7	BBVectorize: Fix anomalous capital letter in comment llvm-svn: 173579	2013-01-26 16:49:03 +00:00
Nadav Rotem	69a040d3eb	LoopVectorize: Refactor the code that vectorizes loads/stores to remove duplication. llvm-svn: 173500	2013-01-25 21:47:42 +00:00
Benjamin Kramer	21e8da5990	LoopVectorize: Simplify code. No functionality change. llvm-svn: 173475	2013-01-25 19:43:15 +00:00
Nadav Rotem	8e9ca2f8cb	LoopVectorizer: Refactor more code to use the IRBuilder. llvm-svn: 173471	2013-01-25 19:26:23 +00:00
Nadav Rotem	c8adf3ff6e	Refactor some code to use the IRBuilder. llvm-svn: 173467	2013-01-25 18:34:09 +00:00
Nadav Rotem	ab3e698ee9	Add support for reverse pointer induction variables. These are loops that contain pointers that count backwards. For example, this is the hot loop in BZIP: do { m = --p; p = ( ... ); } while (--n); llvm-svn: 173219	2013-01-23 01:35:00 +00:00
Nadav Rotem	b2e7e7a0b6	Fix a comment. Induction vars dont need to start at zero. llvm-svn: 173061	2013-01-21 17:59:18 +00:00
Benjamin Kramer	a6e2e2a0a7	LoopVectorize: Fix a C++11 incompatibility. llvm-svn: 172990	2013-01-20 20:29:52 +00:00
Nadav Rotem	da9f2adffd	Fix a build error. llvm-svn: 172971	2013-01-20 09:39:17 +00:00
Nadav Rotem	c42f90b1f4	LoopVectorizer: Implement a new heuristics for selecting the unroll factor. We ignore the cpu frontend and focus on pipeline utilization. We do this because we don't have a good way to estimate the loop body size at the IR level. llvm-svn: 172964	2013-01-20 05:24:29 +00:00
Benjamin Kramer	d455ed85d1	LoopVectorizer: Emit memory checks into their own basic block. This separates the check for "too few elements to run the vector loop" from the "memory overlap" check, giving a lot nicer code and allowing to skip the memory checks when we're not going to execute the vector code anyways. We still leave the decision of whether to emit the memory checks as branches or setccs, but it seems to be doing a good job. If ugly code pops up we may want to emit them as separate blocks too. Small speedup on MultiSource/Benchmarks/MallocBench/espresso. Most of this is legwork to allow multiple bypass blocks while updating PHIs, dominators and loop info. llvm-svn: 172902	2013-01-19 13:57:58 +00:00
Nadav Rotem	d33ce6f100	LoopVectorizer cost model. Honor the user command line flag that selects the vectorization factor even if the target machine does not have any vector registers. llvm-svn: 172544	2013-01-15 18:25:16 +00:00
Nadav Rotem	40e45eeae2	Fix PR14547. Handle induction variables of small sizes smaller than i32 (i8 and i16). llvm-svn: 172348	2013-01-13 07:56:29 +00:00
Nadav Rotem	853fe0acb9	ARM Cost Model: We need to detect the max bitwidth of types in the loop in order to select the max vectorization factor. We don't have a detailed analysis on which values are vectorized and which stay scalars in the vectorized loop so we use another method. We look at reduction variables, loads and stores, which are the only ways to get information in and out of loop iterations. If the data types are extended and truncated then the cost model will catch the cost of the vector zext/sext/trunc operations. llvm-svn: 172178	2013-01-11 07:11:59 +00:00
Nadav Rotem	6eae65cfac	LoopVectorizer: Fix a bug in the vectorization of BinaryOperators. The BinaryOperator can be folded to an Undef, and we don't want to set NSW flags to undef vals. PR14878 llvm-svn: 172079	2013-01-10 17:34:39 +00:00
Nadav Rotem	b1791a75cd	ARM Cost model: Use the size of vector registers and widest vectorizable instruction to determine the max vectorization factor. llvm-svn: 172010	2013-01-09 22:29:00 +00:00
Nadav Rotem	b696c36fcd	Cost Model: Move the 'max unroll factor' variable to the TTI and add initial Cost Model support on ARM. llvm-svn: 171928	2013-01-09 01:15:42 +00:00
Nadav Rotem	3c352c0f4a	Code cleanup: refactor the switch statements in the generation of reduction variables into an IR builder call. llvm-svn: 171871	2013-01-08 17:37:45 +00:00
Nadav Rotem	6f6d21a17b	Rename the enum members to match the LLVM coding style. llvm-svn: 171868	2013-01-08 17:23:17 +00:00
Nadav Rotem	5a197c06f3	LoopVectorizer: Add support for floating point reductions llvm-svn: 171812	2013-01-07 23:13:00 +00:00
Nadav Rotem	c60d7d96f5	LoopVectorizer: When we vectorizer and widen loops we process many elements at once. This is a good thing, except for small loops. On small loops post-loop that handles scalars (and runs slower) can take more time to execute than the rest of the loop. This patch disables widening of loops with a small static trip count. llvm-svn: 171798	2013-01-07 21:54:51 +00:00
Chandler Carruth	b348328b5d	Simplify LoopVectorize to require target transform info and rely on it being present. Make a member of one of the helper classes a reference as part of this. Reformatting goodness brought to you by clang-format. llvm-svn: 171726	2013-01-07 11:12:29 +00:00
Chandler Carruth	b7e60f6844	Merge the unused header file for LoopVectorizer into the source file. This makes the loop vectorizer match the pattern followed by roughly all other passses. =] Notably, this header file was braken in several regards: it contained a using namespace directive, global #define's that aren't globaly appropriate, and global constants defined directly in the header file. As a side benefit, lots of the types in this file become internal, which will cause the optimizer to chew on this pass more effectively. llvm-svn: 171723	2013-01-07 10:44:06 +00:00
Chandler Carruth	7383bfd67e	Switch BBVectorize to directly depend on having a TTI analysis. This could be simplified further, but Hal has a specific feature for ignoring TTI, and so I preserved that. Also, I needed to use it because a number of tests fail when switching from a null TTI to the NoTTI nonce implementation. That seems suspicious to me and so may be something that you need to look into Hal. I worked it by preserving the old behavior for these tests with the flag that ignores all target info. llvm-svn: 171722	2013-01-07 10:22:36 +00:00
Chandler Carruth	04ece8623e	Fix a slew of indentation and parameter naming style issues. This 80% of this patch brought to you by the tool clang-format. I wanted to fix up the names of constructor parameters because they followed a bit of an anti-pattern by naming initialisms with CamelCase: 'Tti', 'Se', etc. This appears to have been in an attempt to not overlap with the names of member variables 'TTI', 'SE', etc. However, constructor arguments can very safely alias members, and in fact that's the conventional way to pass in members. I've fixed all of these I saw, along with making some strang abbreviations such as 'Lp' be simpler 'L', or 'Lgl' be the word 'Legal'. However, the code I was touching had indentation and formatting somewhat all over the map. So I ran clang-format and fixed them. I also fixed a few other formatting or doxygen formatting issues such as using ///< on trailing comments so they are associated with the correct entry. There is still a lot of room for improvement of the formating and cleanliness of this code. ;] At least a few parts of the coding standards or common practices in LLVM's code aren't followed, the enum naming rules jumped out at me. I may mix some of these while I'm here, but not all of them. llvm-svn: 171719	2013-01-07 09:57:00 +00:00
Chandler Carruth	2109f47d97	Fix the enumerator names for ShuffleKind to match tho coding standards, and make its comments doxygen comments. llvm-svn: 171688	2013-01-07 03:20:02 +00:00
Chandler Carruth	d3e73556d6	Move TargetTransformInfo to live under the Analysis library. This no longer would violate any dependency layering and it is in fact an analysis. =] llvm-svn: 171686	2013-01-07 03:08:10 +00:00
Chandler Carruth	21b3c586ab	Switch the loop vectorizer from VTTI to just use TTI directly. llvm-svn: 171620	2013-01-05 10:16:02 +00:00
Chandler Carruth	7c4f91dea5	Switch the BB vectorizer from the VTTI interface to the simple TTI interface. llvm-svn: 171618	2013-01-05 10:05:28 +00:00
Nadav Rotem	e9f5bfd5e9	iLoopVectorize: Non commutative operators can be used as reduction variables as long as the reduction chain is used in the LHS. PR14803. llvm-svn: 171583	2013-01-05 01:15:47 +00:00
Paul Redmond	874f01e956	Do not vectorize loops with subtraction reductions Since subtraction does not commute the loop vectorizer incorrectly vectorizes reductions such as x = A[i] - x. Disabling for now. llvm-svn: 171537	2013-01-04 22:10:16 +00:00
Nadav Rotem	93bd30be9b	Fix a warning llvm-svn: 171525	2013-01-04 21:08:44 +00:00
Nadav Rotem	e1d5c4b8b9	LoopVectorizer: 1. Add code to estimate register pressure. 2. Add code to select the unroll factor based on register pressure. 3. Add bits to TargetTransformInfo to provide the number of registers. llvm-svn: 171469	2013-01-04 17:48:25 +00:00
Nadav Rotem	72f984b596	LoopVectorizer: Add support for loop-unrolling during vectorization for increasing the ILP. At the moment this feature is disabled by default and this commit should not cause any functional changes. llvm-svn: 171436	2013-01-03 00:52:27 +00:00
Nadav Rotem	4897392360	Avoid vectorization when the function has the "noimplicitflot" attribute. llvm-svn: 171429	2013-01-02 23:54:43 +00:00
Chandler Carruth	9fb823bbd4	Move all of the header files which are involved in modelling the LLVM IR into their new header subdirectory: include/llvm/IR. This matches the directory structure of lib, and begins to correct a long standing point of file layout clutter in LLVM. There are still more header files to move here, but I wanted to handle them in separate commits to make tracking what files make sense at each layer easier. The only really questionable files here are the target intrinsic tablegen files. But that's a battle I'd rather not fight today. I've updated both CMake and Makefile build systems (I think, and my tests think, but I may have missed something). I've also re-sorted the includes throughout the project. I'll be committing updates to Clang, DragonEgg, and Polly momentarily. llvm-svn: 171366	2013-01-02 11:36:10 +00:00
Benjamin Kramer	614b5e85b9	Add IRBuilder::CreateVectorSplat and use it to simplify code. llvm-svn: 171349	2013-01-01 19:55:16 +00:00
Bill Wendling	698e84fc4f	Remove the Function::getFnAttributes method in favor of using the AttributeSet directly. This is in preparation for removing the use of the 'Attribute' class as a collection of attributes. That will shift to the AttributeSet class instead. llvm-svn: 171253	2012-12-30 10:32:01 +00:00
Nadav Rotem	0b37f14371	LoopVectorizer: Fix a bug in the code that updates the loop exiting block. LCSSA PHIs may have undef values. The vectorizer updates values that are used by outside users such as PHIs. The bug happened because undefs are not loop values. This patch handles these PHIs. PR14725 llvm-svn: 171251	2012-12-30 07:47:00 +00:00
Nadav Rotem	5350cd314b	If all of the write objects are identified then we can vectorize the loop even if the read objects are unidentified. PR14719. llvm-svn: 171124	2012-12-26 23:30:53 +00:00
Nadav Rotem	3f7c4f36ba	LoopVectorizer: Optimize the vectorization of consecutive memory access when the iteration step is -1 llvm-svn: 171114	2012-12-26 19:08:17 +00:00
Hal Finkel	30e95a8ebb	BBVectorize: Use VTTI to compute costs for intrinsics vectorization For the time being this includes only some dummy test cases. Once the generic implementation of the intrinsics cost function does something other than assuming scalarization in all cases, or some target specializes the interface, some real test cases can be added. Also, for consistency, I changed the type of IID from unsigned to Intrinsic::ID in a few other places. llvm-svn: 171079	2012-12-26 01:36:57 +00:00
Hal Finkel	b44f890133	LoopVectorize: Enable vectorization of the fmuladd intrinsic llvm-svn: 171076	2012-12-25 23:21:29 +00:00
Hal Finkel	2a456112ec	BBVectorize: Enable vectorization of the fmuladd intrinsic llvm-svn: 171075	2012-12-25 22:36:08 +00:00
Nadav Rotem	5f7c12cfbd	LoopVectorizer: When checking for vectorizable types, also check the StoreInst operands. PR14705. llvm-svn: 171023	2012-12-24 09:14:18 +00:00
Nadav Rotem	bd5d1d832a	LoopVectorizer: Fix an endless loop in the code that looks for reductions. The bug was in the code that detects PHIs in if-then-else block sequence. PR14701. llvm-svn: 171008	2012-12-24 01:22:06 +00:00
Benjamin Kramer	28691400dd	LoopVectorize: Fix accidentaly inverted condition. llvm-svn: 171001	2012-12-23 13:21:41 +00:00
Benjamin Kramer	855ba03408	LoopVectorize: For scalars and void types there is no need to compute vector insert/extract costs. Fixes an assert during the build of oggenc in the test suite. llvm-svn: 171000	2012-12-23 13:19:18 +00:00
Nadav Rotem	2cade68025	Loop Vectorizer: Update the cost model of scatter/gather operations and make them more expensive. llvm-svn: 170995	2012-12-23 07:23:55 +00:00
Bill Wendling	c79e42c5ce	Change 'AttrVal' to 'AttrKind' to better reflect that it's a kind of attribute instead of the value of the attribute. llvm-svn: 170972	2012-12-22 00:37:52 +00:00
Roman Divacky	a229186a82	Remove duplicate includes. llvm-svn: 170902	2012-12-21 17:06:44 +00:00
Nadav Rotem	3b850b70b3	Enable if-conversion. llvm-svn: 170841	2012-12-21 04:47:54 +00:00
Nadav Rotem	a4b53f20a3	BB-Vectorizer: Check the cost of the store pointer type and not the return type, which is void. A number of test cases fail after adding the assertion in TTImpl. llvm-svn: 170828	2012-12-21 01:24:36 +00:00
Nadav Rotem	e7785686a5	Fix a bug in the code that checks if we can vectorize loops while using dynamic memory bound checks. Before the fix we were able to vectorize this loop from the Livermore Loops benchmark: for ( k=1 ; k<n ; k++ ) x[k] = x[k-1] + y[k]; llvm-svn: 170811	2012-12-21 00:07:35 +00:00
Nadav Rotem	2ababf68d7	LoopVectorize: Fix a bug in the scalarization of instructions. Before if-conversion we could check if a value is loop invariant if it was declared inside the basic block. Now that loops have multiple blocks this check is incorrect. This fixes External/SPEC/CINT95/099_go/099_go llvm-svn: 170756	2012-12-20 20:24:40 +00:00
Nadav Rotem	8b20c0a814	Loop Vectorizer: turn-off if-conversion. llvm-svn: 170708	2012-12-20 17:42:53 +00:00
Nadav Rotem	7bdc45b570	Loop Vectorizer: Enable if-conversion. llvm-svn: 170632	2012-12-20 02:00:02 +00:00
Nadav Rotem	28408a20c9	whitespace llvm-svn: 170626	2012-12-20 00:49:56 +00:00
Benjamin Kramer	e300004bd5	LoopVectorize: Make iteration over induction variables not depend on pointer values. MapVector is a bit heavyweight, but I don't see a simpler way. Also the InductionList is unlikely to be large. This should help 3-stage selfhost compares (PR14647). llvm-svn: 170528	2012-12-19 11:09:15 +00:00
Bill Wendling	3d7b0b8ac7	Rename the 'Attributes' class to 'Attribute'. It's going to represent a single attribute in the future. llvm-svn: 170502	2012-12-19 07:18:57 +00:00
Benjamin Kramer	f0e5d2f032	LoopVectorize: Emit reductions as log2(vectorsize) shuffles + vector ops instead of scalar operations. For example on x86 with SSE4.2 a <8 x i8> add reduction becomes movdqa %xmm0, %xmm1 movhlps %xmm1, %xmm1 ## xmm1 = xmm1[1,1] paddw %xmm0, %xmm1 pshufd $1, %xmm1, %xmm0 ## xmm0 = xmm1[1,0,0,0] paddw %xmm1, %xmm0 phaddw %xmm0, %xmm0 pextrb $0, %xmm0, %edx instead of pextrb $2, %xmm0, %esi pextrb $0, %xmm0, %edx addb %sil, %dl pextrb $4, %xmm0, %esi addb %dl, %sil pextrb $6, %xmm0, %edx addb %sil, %dl pextrb $8, %xmm0, %esi addb %dl, %sil pextrb $10, %xmm0, %edi pextrb $14, %xmm0, %edx addb %sil, %dil pextrb $12, %xmm0, %esi addb %dil, %sil addb %sil, %dl llvm-svn: 170439	2012-12-18 18:40:20 +00:00
Nadav Rotem	e5e28b48c8	Enable the Loop Vectorizer by default for O2 and O3. Disable if-conversion by default. I plan to revert this patch later today. llvm-svn: 170157	2012-12-13 23:11:54 +00:00
Nadav Rotem	36510f7194	Teach the cost model about the optimization in r169904: Truncation of induction variables costs the same as scalar trunc. llvm-svn: 170051	2012-12-13 00:21:03 +00:00
Nadav Rotem	6027bdf898	Fix indentation. llvm-svn: 170005	2012-12-12 19:39:36 +00:00
Nadav Rotem	d0bb22bba3	LoopVectorizer: Use the "optsize" attribute to decide if we are allowed to increase the function size. llvm-svn: 170004	2012-12-12 19:29:45 +00:00
Nadav Rotem	6798a04b15	Fix the ascii drawing that was ruined when I split the H and CPP llvm-svn: 169955	2012-12-12 01:33:47 +00:00
Nadav Rotem	4fa2e3d5af	fix a typo. llvm-svn: 169953	2012-12-12 01:31:10 +00:00
Nadav Rotem	aeb17df802	LoopVectorizer: When -Os is used, vectorize only loops that dont require a tail loop. There is no testcase because I dont know of a way to initialize the loop vectorizer pass without adding an additional hidden flag. llvm-svn: 169950	2012-12-12 01:11:46 +00:00
Nadav Rotem	f707bf4ca3	PR14574. Fix a bug in the code that calculates the mask the converted PHIs in if-conversion. llvm-svn: 169916	2012-12-11 21:30:14 +00:00
Nadav Rotem	e266efb70b	Loop Vectorize: optimize the vectorization of trunc(induction_var). The truncation is now done on scalars. llvm-svn: 169904	2012-12-11 18:58:10 +00:00
Nadav Rotem	dbb3328194	Fix PR14565. Don't if-convert loops that have switch statements in them. llvm-svn: 169813	2012-12-11 04:55:10 +00:00
Nadav Rotem	07df5ac1a1	Split the LoopVectorizer into H and CPP. llvm-svn: 169771	2012-12-10 21:39:02 +00:00
Nadav Rotem	7b5b55c195	Add support for reverse induction variables. For example: while (i--) sum+=A[i]; llvm-svn: 169752	2012-12-10 19:25:06 +00:00
Paul Redmond	2adb13c100	LoopVectorize: support vectorizing intrinsic calls - added function to VectorTargetTransformInfo to query cost of intrinsics - vectorize trivially vectorizable intrinsic calls such as sin, cos, log, etc. Reviewed by: Nadav llvm-svn: 169711	2012-12-09 20:42:17 +00:00
Paul Redmond	f7cd6b391a	test commit. llvm-svn: 169709	2012-12-09 19:46:31 +00:00
Nadav Rotem	a8f026e2d4	LoopVectorizer: Increase the number of pointers that can be tested at runtime. If we cant prove statically that the pointers are disjoint then we add the runtime check. llvm-svn: 169334	2012-12-04 23:25:24 +00:00
Nadav Rotem	87fc988c5d	Enable if-conversion during vectorization. llvm-svn: 169331	2012-12-04 22:59:52 +00:00
Nadav Rotem	93fa5ef957	Fix a bug in vectorization of if-converted reduction variables. If the reduction variable is not used outside the loop then we ran into an endless loop. This change checks if we found the original PHI. llvm-svn: 169324	2012-12-04 22:40:22 +00:00
Nadav Rotem	a10b311aec	Add support for reduction variables when IF-conversion is enabled. llvm-svn: 169288	2012-12-04 18:17:33 +00:00
Nadav Rotem	07674cb566	Give scalar if-converted blocks half the score because they are not always executed due to CF. llvm-svn: 169223	2012-12-04 07:11:52 +00:00
Nadav Rotem	628c2dba60	Add the last part that is needed for vectorization of if-converted code. Added the code that actually performs the if-conversion during vectorization. We can now vectorize this code: for (int i=0; i<n; ++i) { unsigned k = 0; if (a[i] > b[i]) <------ IF inside the loop. k = k * 5 + 3; a[i] = k; <---- K is a phi node that becomes vector-select. } llvm-svn: 169217	2012-12-04 06:15:11 +00:00
NAKAMURA Takumi	f99b535fdb	LoopVectorize.cpp: Suppress a warning. [-Wunused-variable] llvm-svn: 169195	2012-12-04 00:49:34 +00:00
NAKAMURA Takumi	8b07bc579b	Fix whitespace. llvm-svn: 169194	2012-12-04 00:49:28 +00:00
Nadav Rotem	d479a57f68	minor renaming, documentation and cleanups. llvm-svn: 169175	2012-12-03 22:57:09 +00:00
Nadav Rotem	fad16be973	IF-conversion: teach the cost-model how to grade if-converted loops. llvm-svn: 169171	2012-12-03 22:46:31 +00:00
Nadav Rotem	eee203d885	Now that we have a basic if-conversion infrastructure we can rename the "single basic block loop vectorizer" to "innermost loop vectorizer". llvm-svn: 169158	2012-12-03 21:33:08 +00:00
Nadav Rotem	a30aba7a01	Add initial support for IF-conversion. This patch implements the first 1/3, which is the legality of the if-conversion transformation. The next step is to implement the cost-model for the if-converted code as well as the vectorization itself. llvm-svn: 169152	2012-12-03 21:06:35 +00:00
Chandler Carruth	ed0881b2a6	Use the new script to sort the includes of every file under lib. Sooooo many of these had incorrect or strange main module includes. I have manually inspected all of these, and fixed the main module include to be the nearest plausible thing I could find. If you own or care about any of these source files, I encourage you to take some time and check that these edits were sensible. I can't have broken anything (I strictly added headers, and reordered them, never removed), but they may not be the headers you'd really like to identify as containing the API being implemented. Many forward declarations and missing includes were added to a header files to allow them to parse cleanly when included first. The main module rule does in fact have its merits. =] llvm-svn: 169131	2012-12-03 16:50:05 +00:00
Nadav Rotem	3ae24ee08a	minor cleanups llvm-svn: 169048	2012-11-30 22:37:11 +00:00
Nadav Rotem	6b494be886	Remove the use of LPPassManager. We can remove LPM because we dont need to run any additional loop passes on the new vector loop. llvm-svn: 169016	2012-11-30 17:27:53 +00:00
Nadav Rotem	8dd6ee8df5	When broadcasting invariant scalars into vectors, place the broadcast code in the preheader. llvm-svn: 168927	2012-11-29 19:25:41 +00:00
Hal Finkel	88ee6b0082	BBVectorize: Correctly merge SubclassOptionalData When two instructions are combined into a vector instruction, the resulting instruction must have the most-conservative flags. llvm-svn: 168765	2012-11-28 03:04:10 +00:00
Nadav Rotem	caf5acfd14	Move the code that uses SCEVs prior to creating the new loops. llvm-svn: 168601	2012-11-26 19:51:46 +00:00
Nadav Rotem	ee7ede76f4	Move the max vector width to a constant parameter. No functionality change. llvm-svn: 168570	2012-11-25 16:48:08 +00:00
Nadav Rotem	ef33b5076c	Fix the document style. llvm-svn: 168569	2012-11-25 16:39:01 +00:00
Nadav Rotem	12192f19eb	Refactor the ptr runtime check generation code. No functionality change. llvm-svn: 168568	2012-11-25 16:27:16 +00:00
Nadav Rotem	b15d9fe24d	Rename method. No functionality change. llvm-svn: 168560	2012-11-25 09:13:57 +00:00
Nadav Rotem	bf5173460f	The induction-pointer work is inspired by a research paper. This commit adds a reference. llvm-svn: 168559	2012-11-25 09:09:26 +00:00
Nadav Rotem	ea3824f160	Add support for pointer induction variables even when there is no integer induction variable. llvm-svn: 168558	2012-11-25 08:41:35 +00:00
Nadav Rotem	c3c07e62e8	LoopVectorizer: Add initial support for pointer induction variables (for example: dst++ = src++). At the moment we still require to have an integer induction variable (for example: i++). llvm-svn: 168231	2012-11-17 00:27:03 +00:00
Nadav Rotem	0565b5a279	LoopVectorize: Division reductions generate incorrect code. Remove the part of the code that deals with divs. Thanks to Paul Redmond for catching this while reviewing the code. llvm-svn: 168142	2012-11-16 06:51:17 +00:00
Hal Finkel	e9740a4692	Replace std::vector -> SmallVector in BBVectorize For now, this uses 8 on-stack elements. I'll need to do some profiling to see if this is the best number. Pointed out by Jakob in post-commit review. llvm-svn: 167966	2012-11-14 19:53:27 +00:00
Hal Finkel	1b7f0aba48	Fix the largest offender of determinism in BBVectorize Iterating over the children of each node in the potential vectorization plan must happen in a deterministic order (because it affects which children are erased when two children conflict). There was no need for this data structure to be a map in the first place, so replacing it with a vector is a small change. I believe that this was the last remaining instance if iterating over the elements of a Dense* container where the iteration order could matter. There are some remaining iterations over std::map containers where the order might matter, but so long as the Value for instructions in a block increase with the order of the instructions in the block (or decrease) monotonically, then this will appear to be deterministic. llvm-svn: 167942	2012-11-14 18:38:11 +00:00
Nadav Rotem	a43bcddc8d	use the getSplat API. Patch by Paul Redmond. llvm-svn: 167892	2012-11-14 00:02:13 +00:00
Hal Finkel	b51bdd20d3	BBVectorize: Remove temporary assert used for debugging llvm-svn: 167817	2012-11-13 05:54:54 +00:00
Hal Finkel	2a1df367d4	BBVectorize: Don't vectorize vector-manipulation chains Don't choose a vectorization plan containing only shuffles and vector inserts/extracts. Due to inperfections in the cost model, these can lead to infinite recusion. llvm-svn: 167811	2012-11-13 03:12:40 +00:00
Hal Finkel	3b79f55c5f	BBVectorize: Only some insert element operand pairs are free. This fixes another infinite recursion case when using target costs. We can only replace insert element input chains that are pure (end with inserting into an undef). llvm-svn: 167784	2012-11-12 23:55:36 +00:00
Hal Finkel	9cf3372931	BBVectorize: Use a more sophisticated check for input cost The old checking code, which assumed that input shuffles and insert-elements could always be folded (and thus were free) is too simple. This can only happen in special circumstances. Using the simple check caused infinite recursion. llvm-svn: 167750	2012-11-12 21:21:02 +00:00
Hal Finkel	f8326b6052	BBVectorize: Check the types of compare instructions The pass would previously assert when trying to compute the cost of compare instructions with illegal vector types (like struct pointers). llvm-svn: 167743	2012-11-12 19:41:38 +00:00
Hal Finkel	ef53df0f9f	BBVectorize: Check the input types of shuffles for legality This fixes a bug where shuffles were being fused such that the resulting input types were not legal on the target. This would occur only when both inputs and dependencies were also foldable operations (such as other shuffles) and there were other connected pairs in the same block. llvm-svn: 167731	2012-11-12 14:50:59 +00:00
Nadav Rotem	12930749ab	Fix a comment typo and add comments. llvm-svn: 167684	2012-11-11 05:15:00 +00:00
Nadav Rotem	1cfef3e9ee	Add support for memory runtime check. When we can, we calculate array bounds. If the arrays are found to be disjoint then we run the vectorized version of the loop. If they are not, we run the scalar code. llvm-svn: 167608	2012-11-09 07:09:44 +00:00
Chandler Carruth	acc748b2b5	Fix sign compare warning. Patch by Mahesha HS. llvm-svn: 167282	2012-11-02 05:24:00 +00:00
Hal Finkel	560545b85f	BBVectorize: Use target costs for incoming and outgoing values instead of the depth heuristic. When target cost information is available, compute explicit costs of inserting and extracting values from vectors. At this point, all costs are estimated using the target information, and the chain-depth heuristic is not needed. As a result, it is now, by default, disabled when using target costs. llvm-svn: 167256	2012-11-01 21:50:12 +00:00
Hal Finkel	c89e75e93e	BBVectorize: Account for internal shuffle costs When target costs are available, use them to account for the costs of shuffles on internal edges of the DAG of candidate pairs. Because the shuffle costs here are currently for only the internal edges, the current target cost model is trivial, and the chain depth requirement is still in place, I don't yet have an easy test case. Nevertheless, by looking at the debug output, it does seem to do the right think to the effective "size" of each DAG of candidate pairs. llvm-svn: 167217	2012-11-01 06:26:34 +00:00
Nadav Rotem	4cb8cdab5e	LoopVectorize: Preserve NSW, NUW and IsExact flags. llvm-svn: 167174	2012-10-31 21:40:39 +00:00
Nadav Rotem	ec3ab49dda	Put the threshold magic number in a variable. llvm-svn: 167134	2012-10-31 16:22:16 +00:00
Nadav Rotem	1265ea8f8d	Remove enum values since they are not used anymore. llvm-svn: 167131	2012-10-31 16:14:06 +00:00
Hal Finkel	842ad0b621	BBVectorize: Choose pair ordering to minimize shuffles BBVectorize would, except for loads and stores, always fuse instructions so that the first instruction (in the current source order) would always represent the low part of the input vectors and the second instruction would always represent the high part. This lead to too many shuffles being produced because sometimes the opposite order produces fewer of them. With this change, BBVectorize tracks the kind of pair connections that form the DAG of candidate pairs, and uses that information to reorder the pairs to avoid excess shuffles. Using this information, a future commit will be able to add VTTI-based shuffle costs to the pair selection procedure. Importantly, the number of remaining shuffles can now be estimated during pair selection. There are some trivial instruction reorderings in the test cases, and one simple additional test where we certainly want to do a reordering to avoid an unnecessary shuffle. llvm-svn: 167122	2012-10-31 15:17:07 +00:00
Nadav Rotem	ce77ab0c24	LoopVectorize: Do not vectorize loops with tiny constant trip counts. llvm-svn: 167101	2012-10-31 03:31:07 +00:00
Nadav Rotem	ff7889196b	Add support for loops that don't start with Zero. This is important for loops in the LAPACK test-suite. These loops start at 1 because they are auto-converted from fortran. llvm-svn: 167084	2012-10-31 00:45:26 +00:00
Nadav Rotem	47a299dcc9	Add documentation. llvm-svn: 167055	2012-10-30 22:06:26 +00:00
Hal Finkel	08f34ac9dd	BBVectorize: Cache fixed-order pairs instead of recomputing pointer info. Instead of recomputing relative pointer information just prior to fusing, cache this information (which also needs to be computed during the candidate-pair selection process). This cuts down on the total number of SE queries made, and also is a necessary intermediate step on the road toward including shuffle costs in the pair selection procedure. No functionality change is intended. llvm-svn: 167049	2012-10-30 20:17:37 +00:00
Hal Finkel	2eaadd1a2d	BBVectorize: Fix a small bug introduced in r167042. We need to make sure that we take the correct load/store alignment when the inputs are flipped. llvm-svn: 167044	2012-10-30 19:47:37 +00:00
Hal Finkel	f384890961	BBVectorize: Simplify how input swapping is handled. Stop propagating the FlipMemInputs variable into the routines that create the replacement instructions. Instead, just flip the arguments of those routines. This allows for some associated cleanup (not all of which is done here). No functionality change is intended. llvm-svn: 167042	2012-10-30 19:35:29 +00:00
Hal Finkel	eac2887143	BBVectorize: Don't make calls to SE when the result is unused. SE was being called during the instruction-fusion process (when the result is unreliable, and thus ignored). No functionality change is intended. llvm-svn: 167037	2012-10-30 18:55:49 +00:00
Nadav Rotem	bc21aceb19	LoopVectorize: Add support for write-only loops when the write destination is a single pointer. Speedup SciMark by 1% llvm-svn: 167035	2012-10-30 18:36:45 +00:00
Nadav Rotem	b3e8e688da	LoopVectorize: Fix a bug in the initialization of reduction variables. AND needs to start at all-one while XOR, and OR need to start at zero. llvm-svn: 167032	2012-10-30 18:12:36 +00:00
Nadav Rotem	73ddcfe03f	LoopVectorizer: change debug prints: Print the module identifier when deciding to vectorize. When deciding not to vectorize do not print the called function name because it can be null. llvm-svn: 166989	2012-10-30 00:40:39 +00:00
Nadav Rotem	5ad045a8c5	LoopVectorize: Update and preserve the dominator tree info. llvm-svn: 166970	2012-10-29 21:52:38 +00:00
Hal Finkel	bad10bb2f3	Update BBVectorize to use the new VTTI instr. cost interfaces. The monolithic interface for instruction costs has been split into several functions. This is the corresponding change. No functionality change is intended. llvm-svn: 166865	2012-10-27 04:33:48 +00:00
Nadav Rotem	859366f93f	1. Fix a bug in getTypeConversion. When a simple type is split, we need to return the type of the split result. 2. Change the maximum vectorization width from 4 to 8. 3. A test for both. llvm-svn: 166864	2012-10-27 04:11:32 +00:00
Nadav Rotem	afae78edab	Refactor the VectorTargetTransformInfo interface. Add getCostXXX calls for different families of opcodes, such as casts, arithmetic, cmp, etc. Port the LoopVectorizer to the new API. The LoopVectorizer now finds instructions which will remain uniform after vectorization. It uses this information when calculating the cost of these instructions. llvm-svn: 166836	2012-10-26 23:49:28 +00:00
Hal Finkel	4863448dca	Use VTTI->getNumberOfParts in BBVectorize. This change reflects VTTI refactoring; no functionality change intended. llvm-svn: 166752	2012-10-26 04:28:06 +00:00
Hal Finkel	41a6ded4a0	Disable generation of pointer vectors by BBVectorize. Once vector-of-pointer support works, then this can be reverted. llvm-svn: 166741	2012-10-26 00:05:26 +00:00
Hal Finkel	20a49d6f2c	BBVectorize, when using VTTI, should not form types that will be split. This is needed so that perl's SHA can be compiled (otherwise BBVectorize takes far too long to find its fixed point). I'll try to come up with a reduced test case. llvm-svn: 166738	2012-10-25 23:47:16 +00:00
Hal Finkel	cbf9365f4c	Begin incorporating target information into BBVectorize. This is the first of several steps to incorporate information from the new TargetTransformInfo infrastructure into BBVectorize. Two things are done here: 1. Target information is used to determine if it is profitable to fuse two instructions. This means that the cost of the vector operation must not be more expensive than the cost of the two original operations. Pairs that are not profitable are no longer considered (because current cost information is incomplete, for intrinsics for example, equal-cost pairs are still considered). 2. The 'cost savings' computed for the profitability check are also used to rank the DAGs that represent the potential vectorization plans. Specifically, for nodes of non-trivial depth, the cost savings is used as the node weight. The next step will be to incorporate the shuffle costs into the DAG weighting; this will give the edges of the DAG weights as well. Once that is done, when target information is available, we should be able to dispense with the depth heuristic. llvm-svn: 166716	2012-10-25 21:12:23 +00:00
Nadav Rotem	579042f71b	LoopVectorize: Teach the cost model to query scalar costs as scalar types and not vectors of 1. llvm-svn: 166715	2012-10-25 21:03:48 +00:00
Nadav Rotem	5ffb049a55	Add support for additional reduction variables: AND, OR, XOR. Patch by Paul Redmond <paul.redmond@intel.com>. llvm-svn: 166649	2012-10-25 00:08:41 +00:00
Nadav Rotem	4a87683a41	Implement a basic cost model for vector and scalar instructions. llvm-svn: 166642	2012-10-24 23:47:38 +00:00
Nadav Rotem	e4f491e7ee	whitespace llvm-svn: 166622	2012-10-24 20:58:40 +00:00
Nadav Rotem	a721b21c64	LoopVectorizer: Add a basic cost model which uses the VTTI interface. llvm-svn: 166620	2012-10-24 20:36:32 +00:00
Micah Villmow	51e7246cb4	Back out r166591, not sure why this made it through since I cancelled the command. Bleh, sorry about this! llvm-svn: 166596	2012-10-24 17:25:11 +00:00
Micah Villmow	6a8f3f9e20	Delete a directory that wasn't supposed to be checked in yet. llvm-svn: 166591	2012-10-24 17:20:04 +00:00
Nadav Rotem	5bed7b4fad	Use the AliasAnalysis isIdentifiedObj because it also understands mallocs and c++ news. PR14158. llvm-svn: 166491	2012-10-23 18:44:18 +00:00
Nadav Rotem	1c7fc71e69	Don't crash if the load/store pointer is not a GEP. Fix by Shivarama Rao <Shivarama.Rao@amd.com> llvm-svn: 166427	2012-10-22 18:27:56 +00:00
Hal Finkel	931c52b84c	BBVectorize should ignore unreachable blocks. Unreachable blocks can have invalid instructions. For example, jump threading can produce self-referential instructions in unreachable blocks. Also, we should not be spending time optimizing unreachable code. Fixes PR14133. llvm-svn: 166423	2012-10-22 18:00:55 +00:00
Nadav Rotem	f17cd27362	Rename a variable. llvm-svn: 166410	2012-10-22 04:53:05 +00:00
Nadav Rotem	03011f1393	Vectorizer: optimize the generation of selects. If the condition is uniform, generate a scalar-cond select (i1 as selector). llvm-svn: 166409	2012-10-22 04:38:00 +00:00
Nadav Rotem	c9741887c3	Update the loop vectorizer docs. llvm-svn: 166408	2012-10-22 03:52:53 +00:00
Anders Carlsson	7d8991c778	Avoid an extra hash lookup when inserting a value into the widen map. llvm-svn: 166395	2012-10-21 16:26:35 +00:00
Jakub Staszak	baa063bd03	Simplify code. No functionality change. llvm-svn: 166393	2012-10-21 15:36:03 +00:00
Jakub Staszak	9694ab8ffa	Simplify code. No functionality change. llvm-svn: 166392	2012-10-21 15:29:19 +00:00
Nadav Rotem	fe88c67161	Fix a bug in the vectorization of wide load/store operations. We used a SCEV to detect that A[X] is consecutive. We assumed that X was the induction variable. But X can be any expression that uses the induction for example: X = i + 2; llvm-svn: 166388	2012-10-21 06:49:10 +00:00
Nadav Rotem	c1679a95b6	Add support for reduction variables that do not start at zero. This is important for nested-loop reductions such as : In the innermost loop, the induction variable does not start with zero: for (i = 0 .. n) for (j = 0 .. m) sum += ... llvm-svn: 166387	2012-10-21 05:52:51 +00:00
Nadav Rotem	364bd30641	Document change. Describe the pass and some papers that inspired the design of the pass. llvm-svn: 166386	2012-10-21 04:04:25 +00:00
Nadav Rotem	7e1084d36c	Vectorizer: fix a bug in the classification of induction/reduction phis. llvm-svn: 166384	2012-10-21 02:38:01 +00:00
Nadav Rotem	e5dc57d4fb	Fix an infinite loop in the loop-vectorizer. PR14134. llvm-svn: 166379	2012-10-20 20:45:01 +00:00
Nadav Rotem	d189b82a9b	Vectorize: teach cavVectorizeMemory to distinguish between A[i]+=x and A[B[i]]+=x. If the pointer is consecutive then it is safe to read and write. If the pointer is non-loop-consecutive then it is unsafe to vectorize it because we may hit an ordering issue. llvm-svn: 166371	2012-10-20 08:26:33 +00:00
Nadav Rotem	3940bafb54	Fix a typo llvm-svn: 166367	2012-10-20 05:03:27 +00:00
Nadav Rotem	f70ca3ceed	Vectorizer: refactor the memory checks to a new function. No functionality change. llvm-svn: 166366	2012-10-20 04:59:06 +00:00
Nadav Rotem	550f7f7e19	LoopVectorize: Keep the IRBuilder on the stack. llvm-svn: 166354	2012-10-19 23:27:19 +00:00
Nadav Rotem	4f7f72702b	Vectorizer: Add support for loop reductions. For example: for (i=0; i<n; i++) sum += A[i] + B[i] + i; llvm-svn: 166351	2012-10-19 23:05:40 +00:00
Benjamin Kramer	319cb771b2	LoopVectorize: Keep the IRBuilder on the stack. No functionality change. llvm-svn: 166274	2012-10-19 08:42:02 +00:00
Nadav Rotem	ced93f3a05	vectorizer: Add support for reading and writing from the same memory location. llvm-svn: 166255	2012-10-19 01:24:18 +00:00
Nadav Rotem	1667324f22	cleanup the comment. llvm-svn: 166247	2012-10-18 23:21:01 +00:00
Nadav Rotem	d45a6b93df	fix a naming typo llvm-svn: 166232	2012-10-18 21:45:31 +00:00
Nadav Rotem	f8a1396882	Avoid reconstructing the pointer set when searching for duplicated read/write pointers. llvm-svn: 166205	2012-10-18 18:34:50 +00:00
Nadav Rotem	a031c57417	When looking for a vector representation of a scalar, do a single lookup. Also, cache the result of the broadcast instruction. No functionality change. llvm-svn: 166191	2012-10-18 17:31:49 +00:00
Nadav Rotem	7a1728094c	remove unused variable to fix a warning. llvm-svn: 166170	2012-10-18 06:09:21 +00:00
Nadav Rotem	642efbcdd8	Remove the use of dominators and AA. llvm-svn: 166167	2012-10-18 05:33:02 +00:00
Nadav Rotem	b52f717411	Vectorizer: Add support for loops with an unknown count. For example: for (i=0; i<n; i++){ a[i] = b[i+1] + c[i+3]; } llvm-svn: 166165	2012-10-18 05:29:12 +00:00
NAKAMURA Takumi	7857415785	LoopVectorize.cpp: Fix a warning. [-Wunused-variable] llvm-svn: 166153	2012-10-17 23:40:15 +00:00
Jakub Staszak	68e5dfddcb	Remove redundant SetInsertPoint call. llvm-svn: 166138	2012-10-17 23:06:37 +00:00
Roman Divacky	4955ec317c	Fix some typos and wrong indenting. llvm-svn: 166128	2012-10-17 21:07:35 +00:00
Nadav Rotem	6b94c2a09b	Add a loop vectorizer. llvm-svn: 166112	2012-10-17 18:25:06 +00:00
Micah Villmow	cdfe20b97f	Move TargetData to DataLayout. llvm-svn: 165402	2012-10-08 16:38:25 +00:00
Sylvestre Ledru	91ce36c986	Revert 'Fix a typo 'iff' => 'if''. iff is an abreviation of if and only if. See: http://en.wikipedia.org/wiki/If_and_only_if Commit 164767 llvm-svn: 164768	2012-09-27 10:14:43 +00:00
Sylvestre Ledru	721cffd53a	Fix a typo 'iff' => 'if' llvm-svn: 164767	2012-09-27 09:59:43 +00:00
Benjamin Kramer	8bcc971174	Make MemoryBuiltins aware of TargetLibraryInfo. This disables malloc-specific optimization when -fno-builtin (or -ffreestanding) is specified. This has been a problem for a long time but became more severe with the recent memory builtin improvements. Since the memory builtin functions are used everywhere, this required passing TLI in many places. This means that functions that now have an optional TLI argument, like RecursivelyDeleteTriviallyDeadFunctions, won't remove dead mallocs anymore if the TLI argument is missing. I've updated most passes to do the right thing. Fixes PR13694 and probably others. llvm-svn: 162841	2012-08-29 15:32:21 +00:00
Hal Finkel	918ca2b8b7	Precompute SCEV pointer analysis prior to instruction fusion in BBVectorize. When both a load/store and its address computation are being vectorized, it can happen that the address-computation vectorization destroys SCEV's ability to analyize the relative pointer offsets. As a result (like with the aliasing analysis info), we need to precompute the necessary information prior to instruction fusing. This was found during stress testing (running through the test suite with a very low required chain length); unfortunately, I don't have a small test case. llvm-svn: 159332	2012-06-28 05:42:45 +00:00
Hal Finkel	0873d73cbf	Remove a useless check in BBVectorize. A shuffle mask will always be a constant, but I did not realize that when I originally wrote the code. llvm-svn: 159331	2012-06-28 05:42:43 +00:00
Hal Finkel	f2dcb9a9c4	Allow BBVectorize to form non-2^n-length vectors. The original algorithm only used recursive pair fusion of equal-length types. This is now extended to allow pairing of any types that share the same underlying scalar type. Because we would still generally prefer the 2^n-length types, those are formed first. Then a second set of iterations form the non-2^n-length types. Also, a call to SimplifyInstructionsInBlock has been added after each pairing iteration. This takes care of DCE (and a few other things) that make the following iterations execute somewhat faster. For the same reason, some of the simple shuffle-combination cases are now handled internally. There is some additional refactoring work to be done, but I've had many requests for this feature, so additional refactoring will come soon in future commits (as will additional test cases). llvm-svn: 159330	2012-06-28 05:42:42 +00:00
Hal Finkel	74e5225c92	Refactor operation equivalence checking in BBVectorize by extending Instruction::isSameOperationAs. Maintaining this kind of checking in different places is dangerous, extending Instruction::isSameOperationAs consolidates this logic into one place. Here I've added an optional flags parameter and two flags that are important for vectorization: CompareIgnoringAlignment and CompareUsingScalarTypes. llvm-svn: 159329	2012-06-28 05:42:26 +00:00
NAKAMURA Takumi	704de074b8	llvm/lib: [CMake] Add explicit dependency to intrinsics_gen. llvm-svn: 159112	2012-06-24 13:32:01 +00:00
Hal Finkel	3099ce9489	Allow controlling vectorization of boolean values separately from other integer types. These are used as the result of comparisons, and often handled differently from larger integer types. llvm-svn: 159111	2012-06-24 13:28:01 +00:00
Hal Finkel	4b06b1a0ee	Allow BBVectorize to fuse compare instructions. llvm-svn: 159088	2012-06-23 21:52:50 +00:00
Hal Finkel	fa103d3fc7	Teach BBVectorize to combine, when possible, or discard metadata when fusing instructions. The present implementation handles only TBAA and FP metadata, discarding everything else. For debug metadata, the current behavior is maintained (the debug metadata associated with one of the instructions will be kept, discarding that attached to the other). This should address PR 13040. llvm-svn: 158606	2012-06-16 20:34:06 +00:00
Hal Finkel	27c3246169	Don't vectorize target-specific types (ppc_fp128, x86_fp80, etc.). Target specific types should not be vectorized. As a practical matter, these types are already register matched (at least in the x86 case), and codegen does not always work correctly (at least in the ppc case, and this is not worth fixing because ppc_fp128 is currently broken and will probably go away soon). llvm-svn: 155729	2012-04-27 19:34:00 +00:00
Hal Finkel	52ba49f399	Fix style violation in BBVectorize (pointed out by Bill Wendling) llvm-svn: 154810	2012-04-16 12:39:17 +00:00
Hal Finkel	8ee309d9b7	Simplify checking for pointer types in BBVectorize (this change was suggested by Duncan). llvm-svn: 154787	2012-04-16 03:49:42 +00:00
Hal Finkel	83c9796033	Fix an error in BBVectorize important for vectorizing pointer types. When vectorizing pointer types it is important to realize that potential pairs cannot be connected via the address pointer argument of a load or store. This is because even after vectorization, the address is still a scalar because the address of the higher half of the pair is implicit from the address of the lower half (it need not be, and should not be, explicitly computed). llvm-svn: 154735	2012-04-14 07:32:50 +00:00
Hal Finkel	f589519a67	Enhance BBVectorize to more-properly handle pointer values and vectorize GEPs. llvm-svn: 154734	2012-04-14 07:32:43 +00:00
Hal Finkel	b2336a79f9	Add support to BBVectorize for vectorizing selects. llvm-svn: 154700	2012-04-13 20:45:45 +00:00
Hongbin Zheng	5758f495da	Refactor: Use positive field names in VectorizeConfig. llvm-svn: 154249	2012-04-07 03:56:23 +00:00
Hongbin Zheng	31d33b8318	BBVectorize: Add the const modifier to the VectorizeConfig because we won't modify it. llvm-svn: 154098	2012-04-05 16:07:49 +00:00
Hongbin Zheng	d6825173d3	Introduce the VectorizeConfig class, with which we can control the behavior of the BBVectorizePass without using command line option. As pointed out by Hal, we can ask the TargetLoweringInfo for the architecture specific VectorizeConfig to perform vectorizing with architecture specific information. llvm-svn: 154096	2012-04-05 15:46:55 +00:00
Hongbin Zheng	6edbc39bd7	Add the function "vectorizeBasicBlock" which allow users vectorize a BasicBlock in other passes, e.g. we can call vectorizeBasicBlock in the loop unroll pass right after the loop is unrolled. llvm-svn: 154089	2012-04-05 08:05:16 +00:00
Hal Finkel	5cad8742cc	Correctly vectorize powi. The powi intrinsic requires special handling because it always takes a single integer power regardless of the result type. As a result, we can vectorize only if the powers are equal. Fixes PR12364. llvm-svn: 153797	2012-03-31 03:38:40 +00:00
Sebastian Pop	5ce71b18cb	fix typos llvm-svn: 152035	2012-03-05 17:39:47 +00:00
Sebastian Pop	8844e224b8	remove spaces on empty lines llvm-svn: 152034	2012-03-05 17:39:45 +00:00
Hal Finkel	1bde3f86d1	Update BBVectorize to use aliasesUnknownInst. This allows BBVectorize to check the "unknown instruction" list in the alias sets. This is important to prevent instruction fusing from reordering function calls. Resolves PR11920. llvm-svn: 150250	2012-02-10 15:52:40 +00:00
Sebastian Pop	662beed828	fix indentation llvm-svn: 149857	2012-02-06 05:29:32 +00:00
David Blaikie	f9c1291fde	Simplify contains tests using 'count'. llvm-svn: 149813	2012-02-05 06:35:36 +00:00
NAKAMURA Takumi	32c48634db	BBVectorize.cpp: Get rid of comparision to bool to fix a warning. llvm-svn: 149810	2012-02-05 05:47:51 +00:00
Hal Finkel	135cac922c	Boost the effective chain depth of loads and stores. By default, boost the chain depth contribution of loads and stores. This will allow a load/store pair to vectorize even when it would not otherwise be long enough to satisfy the chain depth requirement. llvm-svn: 149761	2012-02-04 04:14:04 +00:00
Benjamin Kramer	f61f60d97a	BBVectorize: Simplify code, no functionality change. Also silences warnings about bodyless for loops. llvm-svn: 149612	2012-02-02 18:52:15 +00:00
Hal Finkel	8cf51b871c	Minor changes from review. As suggested by Nick Lewycky, the tree traversal queues have been changed to SmallVectors and the associated loops have been rotated. Also, an 80-col violation was fixed. llvm-svn: 149607	2012-02-02 17:29:39 +00:00
Hal Finkel	0f3298e8d4	Vectorize long blocks in groups. Long basic blocks with many candidate pairs (such as in the SHA implementation in Perl 5.14; thanks to Roman Divacky for the example) used to take an unacceptably-long time to compile. Instead, break long blocks into groups so that no group has too many candidate pairs. llvm-svn: 149595	2012-02-02 06:14:56 +00:00
NAKAMURA Takumi	e1d61f666b	BBVectorize.cpp: Try to fix MSVC build. map::iterator and multimap::iterator are incompatible. llvm-svn: 149475	2012-02-01 06:11:58 +00:00
Hal Finkel	8a3aebe5e0	A few of the changes suggested in code review (by Nick Lewycky) llvm-svn: 149472	2012-02-01 05:51:45 +00:00
Hal Finkel	c34e51132c	Add a basic-block autovectorization pass. This is the initial checkin of the basic-block autovectorization pass along with some supporting vectorization infrastructure. Special thanks to everyone who helped review this code over the last several months (especially Tobias Grosser). llvm-svn: 149468	2012-02-01 03:51:43 +00:00

... 11 12 13 14 15 ...

819 Commits