llvm-project

Commit Graph

Author	SHA1	Message	Date
Adhemerval Zanella	ef206f19a4	PowerPC: add EmitTCEntry class for TOC creation This patch replaces EmitRawText with an EmitTCEntry class (specialized for each Streamer) in PowerPC64 TOC entry creation. llvm-svn: 165940	2012-10-15 15:43:14 +00:00
Kostya Serebryany	b0e2506d97	[asan] Make AddressSanitizer a FunctionPass instead of a ModulePass. This will simplify chaining other FunctionPasses with asan. Also some minor cleanup. llvm-svn: 165936	2012-10-15 14:20:06 +00:00
Chandler Carruth	49c8eea3c0	Update the memcpy rewriting to fully support widened int rewriting. This includes extracting ints for copying elsewhere and inserting ints when copying into the alloca. This should fix the CanSROA assertion coming out of Clang's regression test suite. llvm-svn: 165931	2012-10-15 10:24:43 +00:00
Chandler Carruth	9d966a2002	Follow-up fix to r165928: handle memset rewriting for widened integers, and generally clean up the memset handling. It had rotted a bit as the other rewriting logic got polished more. llvm-svn: 165930	2012-10-15 10:24:40 +00:00
Silviu Baranga	b14097000b	Fixed PR13938: the ARM backend was crashing because it couldn't select a VDUPLANE node with the vector input size different from the output size. This was because the BUILD_VECTOR lowering code didn't check that the size of the input vector was correct for using VDUPLANE. llvm-svn: 165929	2012-10-15 09:41:32 +00:00
Chandler Carruth	435c4e0792	First major step toward addressing PR14059. This teaches SROA to handle cases where we have partial integer loads and stores to an otherwise promotable alloca to widen[1] those loads and stores to cover the entire alloca and bitcast them into the appropriate type such that promotion can proceed. These partial loads and stores stem from an annoying confluence of ARM's calling convention and ABI lowering and the FCA pre-splitting which takes place in SROA. Clang lowers a { double, double } in-register function argument as a [4 x i32] function argument to ensure it is placed into integer 32-bit registers (a really unnerving implicit contract between Clang and the ARM backend I would add). This results in a FCA load of [4 x i32]* from the { double, double } alloca, and SROA decomposes this into a sequence of i32 loads and stores. Inlining proceeds, code gets folded, but at the end of the day, we still have i32 stores to the low and high halves of a double alloca. Widening these to be i64 operations, and bitcasting them to double prior to loading or storing allows promotion to proceed for these allocas. I looked quite a bit at changing the IR which Clang produces for this case to be more friendly, but small changes seem unlikely to help. I think the best representation we could use currently would be to pass 4 i32 arguments thereby avoiding any FCAs, but that would still require this fix. It seems like it might eventually be nice to somehow encode the ABI register selection choices outside of the parameter type system so that the parameter can be a { double, double }, but the CC register annotations indicate that this should be passed via 4 integer registers. This patch does not address the second problem in PR14059, which is the reverse: when a struct alloca is loaded as a larger single integer. This patch also does not address some of the code quality issues with the FCA-splitting. Those don't actually impede any optimizations really, but they're on my list to clean up. [1]: Pedantic footnote: for those concerned about memory model issues here, this is safe. For the alloca to be promotable, it cannot escape or have any use of its address that could allow these loads or stores to be racing. Thus, widening is always safe. llvm-svn: 165928	2012-10-15 08:40:30 +00:00
Chandler Carruth	aa6afbb831	Hoist the canConvertValue predicate and the convertValue transform out into static helper functions. They're really quite generic and are going to be needed elsewhere shortly. llvm-svn: 165927	2012-10-15 08:40:22 +00:00
Bill Wendling	fbd38fe2e3	Add an enum for the return and function indexes into the AttrListPtr object. This gets rid of some magic numbers. llvm-svn: 165924	2012-10-15 07:29:08 +00:00
Bill Wendling	79d45dbbf9	Use a ::get method to create the attribute from Attributes::AttrVals instead of a constructor. llvm-svn: 165923	2012-10-15 06:53:28 +00:00
Bill Wendling	8c3e65db52	Move the AttributesImpl header file into the VMCore directory so that it can be opaque. llvm-svn: 165920	2012-10-15 05:40:12 +00:00
Bill Wendling	d079a446d7	Attributes Rewrite Convert the internal representation of the Attributes class into a pointer to an opaque object that's uniqued by and stored in the LLVMContext object. The Attributes class then becomes a thin wrapper around this opaque object. Eventually, the internal representation will be expanded to include attributes that represent code generation options, etc. llvm-svn: 165917	2012-10-15 04:46:55 +00:00
Meador Inge	40b6fac36c	instcombine: Migrate strcmp and strncmp optimizations This patch migrates the strcmp and strncmp optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 165915	2012-10-15 03:47:37 +00:00
Benjamin Kramer	c5b0678cf8	Simplify code. No functionality change. llvm-svn: 165904	2012-10-14 11:15:42 +00:00
Benjamin Kramer	650b1dbd56	Unquadratize SetVector removal loops in DSE. Erasing from the beginning or middle of the vector is expensive, remove_if can do it in linear time even though it's a bit ugly without lambdas. No functionality change. llvm-svn: 165903	2012-10-14 10:21:31 +00:00
Bill Wendling	5ae76a379c	Remove dead methods. llvm-svn: 165902	2012-10-14 09:21:44 +00:00
Bill Wendling	76d2cd2f60	Remove operator cast method in favor of querying with the correct method. llvm-svn: 165899	2012-10-14 08:54:26 +00:00
Benjamin Kramer	6bbdf70818	Fix use after free when deleting attributes in a chained folding set. Can't follow the intrusive linked list when the element is gone. llvm-svn: 165898	2012-10-14 08:48:40 +00:00
Bill Wendling	a0f3e8d1cb	Don't use the new syntax just yet. llvm-svn: 165897	2012-10-14 08:25:35 +00:00
Bill Wendling	2a3c1cca7d	Remove the bitwise AND operators from the Attributes class. Replace it with the equivalent from the builder class. llvm-svn: 165896	2012-10-14 07:52:48 +00:00
Bill Wendling	722b26c0f2	Remove the bitwise assignment OR operator from the Attributes class. Replace it with the equivalent from the builder class. llvm-svn: 165895	2012-10-14 07:35:59 +00:00
Bill Wendling	5c407ed3ab	Remove the bitwise OR operator from the Attributes class. Replace it with the equivalent from the builder class. llvm-svn: 165894	2012-10-14 07:17:34 +00:00
Bill Wendling	a05b043c4a	Remove the bitwise XOR operator from the Attributes class. Replace it with the equivalent from the builder class. llvm-svn: 165893	2012-10-14 06:56:13 +00:00
Bill Wendling	85a64c217f	Remove the bitwise NOT operator from the Attributes class. Replace it with the equivalent from the builder class. llvm-svn: 165892	2012-10-14 06:39:53 +00:00
Bill Wendling	1fcc82225a	Decode the LLVM attributes from bitcode using the attributes builder. llvm-svn: 165891	2012-10-14 04:10:01 +00:00
Bill Wendling	abd5ba2523	Use builder to create alignment attributes. Remove dead function. llvm-svn: 165890	2012-10-14 03:58:29 +00:00
Bill Wendling	9e1eb4d1b9	Don't pass in an Attributes object to something that expects an integral value. llvm-svn: 165887	2012-10-14 03:27:15 +00:00
Benjamin Kramer	44e58f9eb1	Remove unused private field. llvm-svn: 165881	2012-10-13 18:03:34 +00:00
Benjamin Kramer	35480284e7	X86: Disable long nops for all cpus prior to pentiumpro/i686. llvm-svn: 165878	2012-10-13 17:28:35 +00:00
Jakob Stoklund Olesen	ea82bd7f0d	Drop <def,dead> flags when merging into an unused lane. The new coalescer can merge a dead def into an unused lane of an otherwise live vector register. Clear the <dead> flag when that happens since the flag refers to the full virtual register which is still live after the partial dead def. This fixes PR14079. llvm-svn: 165877	2012-10-13 17:26:47 +00:00
Meador Inge	174185084c	instcombine: Migrate strchr and strrchr optimizations This patch migrates the strchr and strrchr optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 165875	2012-10-13 16:45:37 +00:00
Meador Inge	7fb2f7378b	instcombine: Migrate strcat and strncat optimizations This patch migrates the strcat and strncat optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 165874	2012-10-13 16:45:32 +00:00
Meador Inge	df796f893f	Implement new LibCallSimplifier class This patch implements the new LibCallSimplifier class as outlined in [1]. In addition to providing the new base library simplification infrastructure, all the fortified library call simplifications were moved over to the new infrastructure. The rest of the library simplification optimizations will be moved over with follow up patches. NOTE: The original fortified library call simplifier located in the SimplifyFortifiedLibCalls class was not removed because it is still used by CodeGenPrepare. This class will eventually go away too. [1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-August/052283.html llvm-svn: 165873	2012-10-13 16:45:24 +00:00
Jakob Stoklund Olesen	2f6dfc7d0b	Allow for loops in LiveIntervals::pruneValue(). It is possible that the live range of the value being pruned loops back into the kill MBB where the search started. When that happens, make sure that the beginning of KillMBB is also pruned. Instead of starting a DFS at KillMBB and skipping the root of the search, start a DFS at each KillMBB successor, and allow the search to loop back to KillMBB. This fixes PR14078. llvm-svn: 165872	2012-10-13 16:15:31 +00:00
Benjamin Kramer	ecd15d7f6c	X86: Fix accidentally swapped operands. llvm-svn: 165871	2012-10-13 12:50:19 +00:00
Chandler Carruth	ba9319925e	Teach SROA to cope with wrapper aggregates. These show up a lot in ABI type coercion code, especially when targeting ARM. Things like [1 x i32] instead of i32 are very common there. The goal of this logic is to ensure that when we are picking an alloca type, we look through such wrapper aggregates and across any zero-length aggregate elements to find the simplest type possible to form a type partition. This logic should (generally speaking) rarely fire. It only ends up kicking in when an alloca is accessed using two different types (for instance, i32 and float), and the underlying alloca type has wrapper aggregates around it. I noticed a significant amount of this occurring looking at stepanov_abstraction generated code for arm, and suspect it happens elsewhere as well. Note that this doesn't yet address truly heinous IR productions such as those PR14059 concerns. Those result in mismatched sizes of types in addition to mismatched access and alloca types. llvm-svn: 165870	2012-10-13 10:49:33 +00:00
Chandler Carruth	482c61787c	Speculatively harden the conversion logic. I have no idea if this will help the dragonegg builders, and no test case at this point, but this was one dimly plausible case I spotted by inspection. Hopefully will get a testcase from those bots soon-ish, and will tidy this up with proper testing. llvm-svn: 165869	2012-10-13 10:49:30 +00:00
Benjamin Kramer	d6b9362fc2	X86: Promote i8 cmov when both operands are coming from truncates of the same width. X86 doesn't have i8 cmovs so isel would emit a branch. Emitting branches at this level is often not a good idea because it's too late for many optimizations to kick in. This solution doesn't add any extensions (truncs are free) and tries to avoid introducing partial register stalls by filtering direct copyfromregs. I'm seeing a ~10% speedup on reading a random .png file with libpng15 via graphicsmagick on x86_64/westmere, but YMMV depending on the microarchitecture. llvm-svn: 165868	2012-10-13 10:39:49 +00:00
Chandler Carruth	0fb8a7787e	Silence a warning in -assert builds. llvm-svn: 165867	2012-10-13 05:09:27 +00:00
Chandler Carruth	891fec0b56	Clean up how we rewrite loads and stores to the whole alloca. When these are single value types, the load and store should be directly based upon the alloca and then bitcasting can fix the type as needed afterward. This might in theory improve some of the IR coming out of SROA, but I don't expect big changes yet and don't have any test cases on hand. This is really just a cleanup/refactoring patch. The next patch will cause this code path to be hit a lot more, actually get SROA to promote more allocas and include several more test cases. llvm-svn: 165864	2012-10-13 02:41:05 +00:00
Chad Rosier	4996355592	[ms-inline asm] Remove the MatchInstruction() function. Previously, this was the interface between the front-end and the MC layer when parsing inline assembly. Unfortunately, this is too deep into the parsing stack. Specifically, we're unable to handle target-independent assembly (i.e., assembly directives, labels, etc.). Note that MatchAndEmitInstruction() isn't the correct abstraction either. I'll be exposing target-independent hooks shortly, so this is really just a cleanup. llvm-svn: 165858	2012-10-13 00:26:04 +00:00
Andrew Kaylor	4732872bd2	Check section type rather than assuming it's code when emitting sections while processing relocations. llvm-svn: 165854	2012-10-12 23:53:16 +00:00
Manman Ren	7e48b252e7	ARM: a tail-call inside a function where part of a byval argument is on the caller's local frame causes a problem. For example: void f(StructToPass s) { g(&s, sizeof(s)); } will cause a problem with tail-call since part of s is passed via registers and saved in f's local frame. When g tries to access s, part of s may be corrupted since f's local frame is popped out before the tail-call. The current fix is to disable tail-call if getVarArgsRegSaveSize is not 0 for the caller. This is a conservative approach; if we can prove the address of s or part of s is not taken and passed to g, it should be okay to perform tail-call. rdar://12442472 llvm-svn: 165853	2012-10-12 23:39:43 +00:00
Chad Rosier	4453e8453e	[ms-inline asm] Capitalize per coding standard. llvm-svn: 165847	2012-10-12 23:09:25 +00:00
Jim Grosbach	30af442a84	ARM: Mark VSELECT as 'expand'. The backend already pattern matches to form VBSL when it can. We may want to teach it to use the vbsl intrinsics at some point to prevent machine licm from mucking with this, but using the Expand is completely correct. http://llvm.org/bugs/show_bug.cgi?id=13831 http://llvm.org/bugs/show_bug.cgi?id=13961 Patch by Peter Couperus <peter.couperus@st.com>. llvm-svn: 165845	2012-10-12 22:59:21 +00:00
Chad Rosier	2f480a8a50	[ms-inline asm] Use the new API introduced in r165830 in lieu of the MapAndConstraints vector. Also remove the unused Kind argument. llvm-svn: 165833	2012-10-12 22:53:36 +00:00
Jakob Stoklund Olesen	1a87a29d08	Use a transposed algorithm for handleMove(). Completely update one interval at a time instead of collecting live range fragments to be updated. This avoids building data structures, except for a single SmallPtrSet of updated intervals. Also share code between handleMove() and handleMoveIntoBundle(). Add support for moving dead defs across other live values in the interval. The MI scheduler can do that. llvm-svn: 165824	2012-10-12 21:31:57 +00:00
Jakob Stoklund Olesen	1a3eb878f6	Fix coalescing with IMPLICIT_DEF values. PHIElimination inserts IMPLICIT_DEF instructions to guarantee that all PHI predecessors have a live-out value. These IMPLICIT_DEF values are not considered to be real interference when coalescing virtual registers: %vreg1 = IMPLICIT_DEF %vreg2 = MOV32r0 When joining %vreg1 and %vreg2, the IMPLICIT_DEF instruction and its value number should simply be erased since the %vreg2 value number now provides a live-out value for the PHI predecessor block. llvm-svn: 165813	2012-10-12 18:03:04 +00:00
Ulrich Weigand	9aa51d1a2c	Fix big-endian codegen bug in DAGTypeLegalizer::ExpandRes_BITCAST On PowerPC, a bitcast of <16 x i8> to i128 may run through a code path in ExpandRes_BITCAST that attempts to do an intermediate bitcast to a <4 x i32> vector, and then construct the Hi and Lo parts of the resulting i128 by pairing up two of those i32 vector elements each. The code already recognizes that on a big-endian system, the first two vector elements form the Hi part, and the final two vector elements form the Lo part (vice-versa from the little-endian situation). However, we also need to take endianness into account when forming each of those separate pairs: on a big-endian system, vector element 0 is the high part of the pair making up the Hi part of the result, and vector element 1 is the low part of the pair. The code currently always uses vector element 0 as the low part and vector element 1 as the high part, as is appropriate for little-endian platforms only. This patch fixes this by swapping the vector elements as they are paired up as appropriate. llvm-svn: 165802	2012-10-12 15:42:58 +00:00
Duncan Sands	d5772de0eb	Add powerpc-ibm-aix to Triple. Patch by Kai. llvm-svn: 165792	2012-10-12 11:08:57 +00:00
Eric Christopher	ca2ff70eb8	Indenting. llvm-svn: 165785	2012-10-12 02:04:47 +00:00
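
Commit 650b1dbd56 above notes that erasing from the beginning or middle of a vector inside a loop is expensive, while remove_if does the same work in linear time. The sketch below illustrates that general idea on a plain std::vector<int>. It is not the code from the DSE patch: the helper names (eraseOneByOne, eraseWithRemoveIf, checkEquivalent) are hypothetical, and it uses a lambda, which the 2012 code base could not.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: erase matching elements one at a time.
// Every erase() shifts the whole tail, so removing k of n elements
// can cost O(n * k), which is quadratic in the worst case.
template <typename Pred>
void eraseOneByOne(std::vector<int> &V, Pred P) {
  for (auto I = V.begin(); I != V.end();) {
    if (P(*I))
      I = V.erase(I); // erase() returns the iterator after the removed slot
    else
      ++I;
  }
}

// Hypothetical helper: the erase/remove_if idiom. remove_if compacts
// the surviving elements toward the front in a single pass, then one
// erase() call drops the leftover tail, so the total cost is O(n).
template <typename Pred>
void eraseWithRemoveIf(std::vector<int> &V, Pred P) {
  V.erase(std::remove_if(V.begin(), V.end(), P), V.end());
}

// Small self-check: both strategies must leave the same contents.
inline void checkEquivalent() {
  std::vector<int> A = {1, 2, 3, 4, 5, 6};
  std::vector<int> B = A;
  auto IsEven = [](int X) { return X % 2 == 0; };
  eraseOneByOne(A, IsEven);
  eraseWithRemoveIf(B, IsEven);
  assert(A == B);
}
```

Both helpers preserve the relative order of the surviving elements, which matters for an insertion-ordered container such as the SetVector the commit mentions.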

56768 Commits