- Instead of computing a bunch of buckets of different flag types, just do an
incremental link resolving conflicts as they arise.
- This also has the advantage of making the link result deterministic and not
dependent on map iteration order.
llvm-svn: 172634
AT_producer, which includes clang's version information, so we can tell
which version of the compiler was used.
This is the first of two steps to allow us to do that. This is the llvm-mc
change to provide a method to set the AT_producer string. The second step,
coming soon to a clang near you, will have the clang driver pass the value
of getClangFullVersion() via a flag when invoking the integrated assembler
on assembly source files.
rdar://12955296
llvm-svn: 172630
In r143502, we renamed getHostTriple() to getDefaultTargetTriple()
as part of work to allow the user to supply a different default
target triple at configure time. This change also affected the JIT.
However, it is inappropriate to use the default target triple in the
JIT in most circumstances because this will not necessarily match
the current architecture used by the process, leading to illegal
instruction and other such errors at run time.
Introduce the getProcessTriple() function for use in the JIT and
its clients, and cause the JIT to use it. On architectures with a
single bitness, the host and process triples are identical. On other
architectures, the host triple represents the architecture of the
host CPU, while the process triple represents the architecture used
by the host CPU to interpret machine code within the current process.
For example, when executing 32-bit code on a 64-bit Linux machine,
the host triple may be 'x86_64-unknown-linux-gnu', while the process
triple may be 'i386-unknown-linux-gnu'.
This fixes JIT for the 32-on-64-bit (and vice versa) build on non-Apple
platforms.
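A minimal sketch of the intended use, assuming the new function is exposed
as llvm::sys::getProcessTriple() returning a std::string (the header
location here is an assumption):

  #include "llvm/Support/Host.h"  // assumed location of the declaration
  #include <string>

  std::string pickJITTriple() {
    // Prefer the process triple over getDefaultTargetTriple() when setting
    // up a JIT: it describes how machine code is interpreted within this
    // process, e.g. "i386-unknown-linux-gnu" for 32-bit code on a 64-bit
    // Linux host.
    return llvm::sys::getProcessTriple();
  }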
Differential Revision: http://llvm-reviews.chandlerc.com/D254
llvm-svn: 172627
Hope you are feeling better.
The Mips RDHWR (Read Hardware Register) instruction was not
tested for assembler or disassembler consumption. This patch
adds that functionality.
Contributor: Vladimir Medic
llvm-svn: 172579
using the DW_FORM_GNU_addr_index and a separate .debug_addr section which
stays in the executable and is fully linked.
Sneak in two other small changes:
a) Print out the debug_str_offsets.dwo section.
b) Change the form we expect the entries in the debug_str_offsets.dwo
section to take from ULEB128 to U32.
Add tests for all of this in the fission-cu.ll test.
llvm-svn: 172578
some optimization opportunities (in the enclosing super-expressions).
Rule 1: (-0.0 - X) * Y => -0.0 - (X * Y)
  if the expression "-0.0 - X" has only one reference.
Rule 2: (0.0 - X) * Y => -0.0 - (X * Y)
  if the expression "0.0 - X" has only one reference, and
  the instruction is marked "noSignedZero".
2. Eliminate negation (the compiler was already able to handle these
optimizations if the 0.0s are replaced with -0.0):
Rule 3: (0.0 - X) * (0.0 - Y) => X * Y
Rule 4: (0.0 - X) * C => X * -C
  if the expression is flagged "noSignedZero".
3.
Rule 5: (X*Y) * X => (X*X) * Y
  if X != Y and the expression is flagged "UnsafeAlgebra".
(Rules 1-5 are sketched in source form after item 4 below.)
The purpose of this transformation is two-fold:
a) to form a power expression (of X).
b) potentially shorten the critical path: After transformation, the
latency of the instruction Y is amortized by the expression of X*X,
and therefore Y is in a "less critical" position compared to what it
was before the transformation.
4. Remove the InstCombine code for simplifying "X * select".
The reasons are as follows:
a) The "select" is somewhat architecture-dependent, therefore the
higher-level optimizers are not able to precisely predict whether
the simplification really yields any performance improvement.
b) The "select" operator is a bit complicated, and tends to obscure
optimization opportunities. It is better to keep it as low as
possible in the expression tree, and let CodeGen tackle the
optimization.
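A source-level sketch of rules 1-5 (C++; the function names are invented
for illustration, and the folds only fire under the stated fast-math
conditions):

  // Rule 1: (-0.0 - X) * Y => -0.0 - (X * Y)
  // The negation is hoisted out of the multiply, exposing the product
  // to the enclosing super-expression.
  double rule1(double x, double y) {
    return (-0.0 - x) * y;        // may become -(x * y)
  }

  // Rule 3: (0.0 - X) * (0.0 - Y) => X * Y -- the negations cancel.
  double rule3(double x, double y) {
    return (0.0 - x) * (0.0 - y); // may become x * y
  }

  // Rule 5: (X*Y) * X => (X*X) * Y (requires "UnsafeAlgebra")
  // Forms a power of x and moves y off the critical path.
  double rule5(double x, double y) {
    return (x * y) * x;           // may become (x * x) * y
  }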
llvm-svn: 172551
Test was failing for clang-native-arm-cortex-a9 build-bot configuration.
The reason for the failure was the test was using hardcoded names.
The attached patch fixes this failure by replacing the hard-coded variable
names with pattern-matched variable names.
Patch by Manish Verma, ARM
llvm-svn: 172534
we need to generate an N64 compound relocation
R_MIPS_GPREL_32/R_MIPS_64/R_MIPS_NONE.
The bug was exposed by the SingleSource test case
DuffsDevice.c.
Contributor: Jack Carter
llvm-svn: 172496
---------------------------------------------------------------------------
C_A: reassociation is allowed
C_R: reciprocal of a constant C is appropriate, which means
- 1/C is exact, or
- reciprocal is allowed and 1/C is neither a special value nor a denormal.
-----------------------------------------------------------------------------
Rule 1: (X/C1) / C2 => X / (C2*C1) (if C_A)
                    => X * (1/(C2*C1)) (if C_A && C_R)
Rule 2: X*C1 / C2 => X * (C1/C2) (if C_A)
Rule 3: (X/Y)/Z => X/(Y*Z) (if C_A && at least one of Y and Z is a symbolic value)
Rule 4: Z/(X/Y) => (Z*Y)/X (similar to Rule 3)
Rule 5: C1/(X*C2) => (C1/C2) / X (if C_A)
Rule 6: C1/(X/C2) => (C1*C2) / X (if C_A)
Rule 7: C1/(C2/X) => (C1/C2) * X (if C_A)
(A source-level sketch of two of these follows.)
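Two of the rules, sketched at the source level (C++; the function names
are invented, and the folds assume the C_A/C_R conditions above):

  // Rule 1: (X/C1) / C2 => X / (C2*C1), or X * (1/(C2*C1)) when the
  // reciprocal is exact or allowed.
  double divDiv(double x) {
    return (x / 4.0) / 2.0;   // may become x / 8.0, i.e. x * 0.125
  }

  // Rule 3: (X/Y)/Z => X/(Y*Z) -- one division instead of two.
  double divDivSym(double x, double y, double z) {
    return (x / y) / z;       // may become x / (y * z)
  }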
llvm-svn: 172488
The included test case is derived from one of the GCC compatibility tests.
The problem arises after the selection DAG has been converted to type-legalized
form. The combiner first sees a 64-bit load that can be converted into a
pre-increment form. The original load feeds into a SRL that isolates the
upper 32 bits of the loaded doubleword. This looks like an opportunity for
DAGCombiner::ReduceLoadWidth() to replace the 64-bit load with a 32-bit load.
However, this transformation is not valid, as the replacement load is not
a pre-increment load. The pre-increment load produces an extra result,
which feeds a subsequent add instruction. The replacement load only has
one result value, and this value is propagated to all uses of the pre-
increment load, including the add. Because the add is looking for the
second result value as its operand, it ends up attempting to add a constant
to a token chain, resulting in a crash.
So the patch simply disables this transformation for any load with more than
two result values.
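A hedged C++ sketch of the kind of source pattern involved (the actual
failing test comes from the GCC compatibility suite; this is only an
illustration):

  // The 64-bit load of *++p can be selected as a pre-increment load on
  // PPC: a single node producing both the loaded value and the updated
  // pointer. Narrowing it to a 32-bit load for the >> 32 would drop the
  // pointer result that the pointer update still needs.
  unsigned highWord(unsigned long long *&p) {
    unsigned long long v = *++p;   // candidate pre-increment load
    return (unsigned)(v >> 32);    // SRL isolating the upper 32 bits
  }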
llvm-svn: 172480
The reason that this occurs is that tail calling objc_autorelease eventually
tail calls -[NSObject autorelease] which supports fast autorelease. This can
cause us to violate the semantic guarantees of __autoreleasing variables, namely
that assignment to an __autoreleasing variable always yields an object that is
placed into the innermost autorelease pool.
The fix included in this patch works by:
1. In the peephole optimization function OptimizeIndividualFunctions, always
remove tail call from objc_autorelease.
2. Whenever we convert to/from an objc_autorelease, set/unset the tail call
keyword as appropriate.
*NOTE* I also handled the case where objc_autorelease is converted in
OptimizeReturns to an autoreleaseRV which still violates the ARC semantics. I
will be removing that in a later patch and I wanted to make sure that the tree
is in a consistent state vis-a-vis ARC always.
Additionally, some test cases are provided, and all tests that had objc_autorelease
calls marked with the tail call keyword have been modified so that the keyword is removed.
*NOTE* One test fails due to a separate bug that I am going to commit soon. Thus
I marked the check line TMP: instead of CHECK: so make check does not fail.
llvm-svn: 172287
register names in the standalone assembler llvm-mc.
Registers such as $A1 can represent either a 32- or
64-bit register based on the instruction using it.
In addition, based on the ABI, $T0 can represent different
32-bit registers.
The problem is resolved by changing the Mips-specific AsmParser
and td definitions to work together. Many cases of
RegisterClass parameters are now RegisterOperand.
Contributor: Vladimir Medic
llvm-svn: 172284
the target if it supports the different CAST types. We didn't do this
on X86 because of the different register sizes and types, but on ARM
this makes sense.
llvm-svn: 172245
- recognize the string "{memory}" in the MI generation
- mark as mayLoad/mayStore when there's a memory clobber constraint.
PR14859.
Patch by Krzysztof Parzyszek
llvm-svn: 172228
We don't have a detailed analysis on which values are vectorized and which stay scalars in the vectorized loop, so we use
another method. We look at reduction variables, loads and stores, which are the only ways to get information in and out
of loop iterations. If the data types are extended and truncated then the cost model will catch the cost of the vector
zext/sext/trunc operations.
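For illustration, a hypothetical loop of this kind: the i8 values are
widened for the arithmetic and truncated back on the store, so the cost
model must charge for the vector zext/trunc as well:

  void scale(unsigned char *a, int n) {
    for (int i = 0; i < n; ++i)
      a[i] = (unsigned char)(a[i] * 3); // i8 -> i32 mul -> trunc back to i8
  }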
llvm-svn: 172178
Messages:
Converted test case trivial_codegen_tailcall.ll to use FileCheck.
Converted test return_constant.ll to use FileCheck instead of grep.
Converted test reorder_load.ll to use FileCheck instead of grep.
Converted test intervening-inst.ll to use FileCheck instead of grep.
llvm-svn: 172171
This fixes va_start/va_copy of a va_list field which happens to not
be laid out at a 16-byte boundary.
Differential Revision: http://llvm-reviews.chandlerc.com/D276
llvm-svn: 172128
requirement when creating stack objects in MachineFrameInfo.
Add CreateStackObjectWithMinAlign to throw an error when the minimal alignment
can't be achieved and to clamp the alignment when the preferred alignment
can't be achieved. The same is true for CreateVariableSizedObject.
No error will be emitted in CreateSpillStackObject or CreateStackObject.
As long as callers of CreateStackObject do not assume the object will be
aligned at the requested alignment, we should not see miscompiles, since
later optimizations which look at the object's alignment will have the correct
information.
rdar://12713765
llvm-svn: 172027
It cached XOR's operands before calling visitXOR() but failed to update the
operands when visitXOR() changed the XOR node.
rdar://12968664
llvm-svn: 171999
This patch adjusts r171506 to make all DWARF encodings PC-relative
for PPC64. It also adds the R_PPC64_REL32 relocation handling in MCJIT
(since the eh_frame will not generate PIC-relative relocations) and
adds the emission of the stubs created by the TTypeEncoding.
llvm-svn: 171979
PR14848. The lowered sequence is based on the existing sequence the target-independent
DAG Combiner creates for the scalar case.
Patch by Zvi Rackover.
llvm-svn: 171953
This was an experimental option, but needs to be defined
per-target. e.g. PPC A2 needs to aggressively hide latency.
I converted some in-order scheduling tests to A2. Hal is working on
more test cases.
llvm-svn: 171946
This avoids FileCheck failing over different comment characters in
assembly (notably powerpc64 on Linux vs Darwin) and should fix David's
build-bot.
llvm-svn: 171886
value in the 64-bit .eh_frame section.
It doesn't, however, allow exception handling to work
yet, since that depends on the correct relocation model
being set in the ELF header flags.
Contributor: Jack Carter
llvm-svn: 171881
The current Intel Atom microarchitecture has a feature whereby
when a function returns early then it is slightly faster to execute
a sequence of NOP instructions to wait until the return address is ready,
as opposed to simply stalling on the ret instruction until
the return address is ready.
When compiling for X86 Atom only, this patch will run a pass,
called "X86PadShortFunction", which will add NOP instructions where less
than four cycles elapse between function entry and return.
It includes tests.
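For example, a trivially short function like the following (hypothetical)
one would be padded:

  // Fewer than four cycles elapse between entry and ret, so on Atom the
  // pass inserts NOPs before the ret to cover the return-address delay.
  int alwaysZero(void) { return 0; }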
This patch has been updated to address Nadav's review comments
- Optimize only at >= O1 and don't do optimization if -Os is set
- Stores MachineBasicBlock* instead of BBNum
- Uses DenseMap instead of std::map
- Fixes placement of braces
Patch by Andy Zhang.
llvm-svn: 171879
make it into the last commit.
Also, update the test-generation script to generate an exhaustive test for
align_to_end as well, and include the generated test.
llvm-svn: 171811
small loops. On small loops the scalar post-loop (which runs slower) can take more time to execute than the
rest of the loop. This patch disables widening of loops with a small static trip count.
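A hypothetical example of a loop that is now left scalar:

  // With a static trip count this small, the vector body plus the scalar
  // post-loop can cost more than the plain scalar loop, so widening is
  // skipped.
  void tiny(float *a, const float *b) {
    for (int i = 0; i < 4; ++i)   // small static trip count
      a[i] += b[i];
  }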
llvm-svn: 171798
o. X/C1 * C2 => X * (C2/C1) (if C2/C1 is neither a special FP value nor a denormal)
o. X/C1 * C2 => X/(C1/C2) (if C2/C1 is either a special FP value or a denormal, but C1/C2 is a normal FP value)
Let MDC denote a multiplication or division with one and only one operand being a constant.
o. (MDC ± C1) * C2 => (MDC * C2) ± (C1 * C2)
(so long as the constant folding doesn't yield any denormal or special value)
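A source-level sketch of the first rule (C++; the fold assumes the
fast-math conditions above hold):

  // X/C1 * C2 => X * (C2/C1) when C2/C1 is a normal FP constant.
  double fold(double x) {
    return x / 4.0 * 8.0;   // may become x * 2.0
  }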
llvm-svn: 171793
proposal. This leaves the strings in the skeleton DIE as strp,
but in all dwo files they're accessed now via DW_FORM_GNU_str_index.
Add support for dumping these sections and modify the fission-cu.ll
testcase to have the correct strings and form. Fix a small bug
in the fixed form sizes routine that involved out-of-bounds array accesses
for the table, and add a FIXME in the extractFast routine to fix
this up.
llvm-svn: 171779
code generation. Variables addressed through a GlobalAlias were not being
handled, and variables with available_externally linkage were treated
incorrectly. The patch contains two new tests to verify the correct code
generation for these cases.
llvm-svn: 171778
turning code like this:
if (foo)
free(foo)
into this:
free(foo)
Move a call to free from basic block FB into FB's predecessor, P,
when the path from P to FB is taken only if the argument of free is
not equal to NULL.
Some restrictions apply to P and FB to make sure that this code motion
is profitable, namely:
1. FB must have only one predecessor P.
2. FB must contain only the call to free plus an unconditional
branch to S.
3. P's successors are FB and S.
Because of 1., we will not increase the code size when moving the call
to free from FB to P.
Because of 2., FB will be empty after the move.
Because of 2. and 3., P's branch instruction becomes useless, and so does
FB (simplifycfg will do the job).
llvm-svn: 171762
TargetTransformInfo rather than TargetLowering, removing one of the
primary instances of the layering violation of Transforms depending
directly on Target.
This is a really big deal because LSR used to be a "special" pass that
could only be tested fully using llc and by looking at the full output
of it. It also couldn't run with any other loop passes because it had to
be created by the backend. No longer is this true. LSR is now just
a normal pass and we should probably lift the creation of LSR out of
lib/CodeGen/Passes.cpp and into the PassManagerBuilder. =] I've not done
this, or updated all of the tests to use opt and a triple, because
I suspect someone more familiar with LSR would do a better job. This
change should be essentially without functional impact for normal
compilations, and only change the behavior of targetless compilations.
The conversion required changing all of the LSR code to refer to the TTI
interfaces, which fortunately are very similar to TargetLowering's
interfaces. However, it also allowed us to *always* expect to have some
implementation around. I've pushed that simplification through the pass,
and leveraged it to simplify code somewhat. It required some test
updates for one of two things: either we used to skip some checks
altogether but now we get the default "no" answer for them, or we used
to have no information about the target and now we do have some.
I've also started the process of removing AddrMode, as the TTI interface
doesn't use it any longer. In some cases this simplifies code, and in
others it adds some complexity, but I think it's not a bad tradeoff even
there. Subsequent patches will try to clean this up even further and use
other (more appropriate) abstractions.
Yet again, almost all of the formatting changes brought to you by
clang-format. =]
llvm-svn: 171735
bogus comparison operands to default to eq/oeq. Fix that, fix a couple of
tests that accidentally passed, and test for bogus comparison operators
explicitly.
llvm-svn: 171733
This could be simplified further, but Hal has a specific feature for
ignoring TTI, and so I preserved that.
Also, I needed to use it because a number of tests fail when switching
from a null TTI to the NoTTI nonce implementation. That seems suspicious
to me and so may be something that you need to look into, Hal. I worked
around it by preserving the old behavior for these tests with the flag
that ignores all target info.
llvm-svn: 171722
This works fine with GDB for member variable pointers, but GDB's support for
member function pointers seems to be quite unrelated to
DW_TAG_ptr_to_member_type. (see GDB bug 14998 for details)
llvm-svn: 171698
cvtsi2* should parse with an 'l' or 'q' suffix or no suffix at all. No suffix should be treated the same as an 'l' suffix. Printing should always print a suffix. Previously we didn't parse or print an 'l' suffix.
cvtt*2si/cvt*2si should parse with an 'l' or 'q' suffix or no suffix at all. No suffix should use the destination register size to choose the encoding. Printing should not print a suffix.
Original 'l' suffix issue with cvtsi2* pointed out by Michael Kuperstein.
llvm-svn: 171668
This change essentially reverts r87069 which came without a test case. It
causes no regressions in the GDB 7.5 test suite & fixes 25 xfails (commit
to the test suite to follow). If anyone can present a test case that
demonstrates why this check is necessary I'd be happy to account for it in one
way or another.
llvm-svn: 171609
URL: http://llvm.org/viewvc/llvm-project?rev=171524&view=rev
Log:
The current Intel Atom microarchitecture has a feature whereby when a function
returns early then it is slightly faster to execute a sequence of NOP
instructions to wait until the return address is ready,
as opposed to simply stalling on the ret instruction
until the return address is ready.
When compiling for X86 Atom only, this patch will run a pass, called
"X86PadShortFunction", which will add NOP instructions where less than four
cycles elapse between function entry and return.
It includes tests.
Patch by Andy Zhang.
llvm-svn: 171603
returns early then it is slightly faster to execute a sequence of NOP
instructions to wait until the return address is ready,
as opposed to simply stalling on the ret instruction
until the return address is ready.
When compiling for X86 Atom only, this patch will run a pass, called
"X86PadShortFunction", which will add NOP instructions where less than four
cycles elapse between function entry and return.
It includes tests.
Patch by Andy Zhang.
llvm-svn: 171524
reachability.
We conservatively approximate the reachability analysis by saying it is not
reachable if there is a single path starting from "From" and the path does not
reach "To".
rdar://12801584
llvm-svn: 171512
This patch fixes the PPC eh_frame definitions for the personality and
frame unwinding for PIC objects. It makes the PIC build correctly create
relative relocations in the '.rela.eh_frame' segments, thus avoiding
a text relocation that generates a DT_TEXTREL segment in the link phase.
llvm-svn: 171506
1. Add code to estimate register pressure.
2. Add code to select the unroll factor based on register pressure.
3. Add bits to TargetTransformInfo to provide the number of registers.
llvm-svn: 171469
They are failing because archives create unaligned ELF files. The recent
Endian change added a __builtin_unreachable() when this happens. I will be
committing a fix for this soon.
llvm-svn: 171438
Most IMPLICIT_DEF instructions are removed by the ProcessImplicitDefs
pass, and a few are reinserted by PHIElimination when a PHI argument is
<undef>.
RegisterCoalescer was assuming that all IMPLICIT_DEF live ranges look
like those created by PHIElimination, and that their live range never
leaves the basic block.
The PR14732 test case does tricks with PHI nodes that cause a longer
IMPLICIT_DEF live range to appear. This happens very rarely, but
RegisterCoalescer should be able to handle it.
llvm-svn: 171435
sections for debug info. These are some of the dwo sections from the
DWARF5 split debug info proposal. Update the fission-cu.ll testcase
to show what we should be able to dump more of now.
Work in progress: Ultimately the relocations will be gone for the
dwo section and the strings will be in a different form (and
the rest of the sections will be included).
llvm-svn: 171428
DAGCombiner::reduceBuildVecConvertToConvertBuildVec() was making two
mistakes:
1. It was checking the legality of scalar INT_TO_FP nodes and then generating
vector nodes.
2. It was passing the result value type to
TargetLoweringInfo::getOperationAction() when it should have been
passing the value type of the first operand.
llvm-svn: 171420
LCSSA PHIs may have undef values. The vectorizer updates values that are used by outside users such as PHIs.
The bug happened because undefs are not loop values. This patch handles these PHIs.
PR14725
llvm-svn: 171251
propagating one of the values it simplified to a constant across
a myriad of instructions. Notably, ptrtoint instructions when we had
a constant pointer (say, 0) didn't propagate that, blocking a massive
number of down-stream optimizations.
This was uncovered when investigating why we fail to inline and delete
the boilerplate in:
void f() {
std::vector<int> v;
v.push_back(1);
}
It turns out most of the efforts I've made thus far to improve the
analysis weren't making it far purely because of this. After this is
fixed, the store-to-load forwarding patch enables LLVM to optimize the
above to an empty function. We still can't nuke a second push_back, but
for different reasons.
There is a very real chance this will cause somewhat noticeable changes
in inlining behavior, so please let me know if you see regressions (or
improvements!) because of this patch.
llvm-svn: 171196
how to propagate constants through insert and extract value
instructions.
With the recent improvements to instsimplify, this allows inline cost
analysis to constant fold through intrinsic functions, including notably
the with.overflow intrinsic math routines which often show up inside of
STL abstractions. This is yet another piece in the puzzle of breaking
down the code for:
void f() {
std::vector<int> v;
v.push_back(1);
}
But it still isn't enough. There are a pile of bugs in inline cost still
blocking this.
llvm-svn: 171195
constant folding calls. Add the initial tests for this which show that
now instsimplify can simplify blindingly obvious code patterns expressed
with both intrinsics and library calls.
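The kind of blindingly obvious pattern this covers, sketched as
(hypothetical) C++:

  #include <cmath>

  // With a constant argument, both the libm call and the corresponding
  // intrinsic form can now be folded by instsimplify.
  double folded() {
    return std::fabs(-3.0);   // simplifies to 3.0
  }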
llvm-svn: 171194
register. In most cases we actually compare or select YMM-sized registers
and mixing the two types creates horrible code. This commit optimizes
some of the transition sequences.
PR14657.
llvm-svn: 171148
information doesn't return an addend for Rel relocations. Go ahead
and use this information to fix relocation handling inside dwarfdump
for 32-bit ELF REL.
llvm-svn: 171126
For the time being this includes only some dummy test cases. Once the
generic implementation of the intrinsics cost function does something other
than assuming scalarization in all cases, or some target specializes the
interface, some real test cases can be added.
Also, for consistency, I changed the type of IID from unsigned to Intrinsic::ID
in a few other places.
llvm-svn: 171079
As with the prefetch intrinsic to which it maps, simply have dcbt
marked as reading from and writing to its arguments instead of having
unmodeled side effects. While this might cause unwanted code motion
(because aliasing checks don't really capture cache-line sharing),
it is more important that prefetches in unrolled loops don't block
the scheduler from rearranging the unrolled loop body.
llvm-svn: 171073
Use of store or load with the atomic specifier on 64-bit types would
cause instruction-selection failures. As with the 32-bit case, these
can use the default expansion in terms of cmp-and-swap.
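For example, code like this (hypothetical) snippet previously failed
instruction selection and now lowers through the cmp-and-swap expansion:

  #include <atomic>
  #include <cstdint>

  std::atomic<int64_t> counter;

  void bump(int64_t v) {
    counter.store(v);        // 64-bit atomic store
  }

  int64_t snapshot() {
    return counter.load();   // 64-bit atomic load
  }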
llvm-svn: 171072
When these instructions are encoded in VEX (on AVX) there is no such requirement. This changes the folding
tables and removes the alignment restrictions from VEX-encoded instructions.
llvm-svn: 171024
the cost of arithmetic functions. We now assume that the cost of arithmetic
operations that are marked as Legal or Promote is low, but ops that are
marked as Custom are more expensive.
llvm-svn: 171002
pmuludq is slow, but it turns out that all the unpacking and packing of the
scalarized mul is even slower. 10% speedup on loop-vectorized paq8p.
llvm-svn: 170985
The only way to read the eflags is using push and pop. If we don't
adjust the stack then we run over the first frame index. This is
not something that we want to do, so we have to make sure that
our machine function does not copy the flags. If it does then
we have to emit the prolog that adjusts the stack.
rdar://12896831
llvm-svn: 170961
memory bounds checks. Before the fix we were able to vectorize this loop from
the Livermore Loops benchmark:
for ( k=1 ; k<n ; k++ )
x[k] = x[k-1] + y[k];
llvm-svn: 170811
Before if-conversion we could check if a value is loop invariant
if it was declared inside the basic block. Now that loops have
multiple blocks this check is incorrect.
This fixes External/SPEC/CINT95/099_go/099_go
llvm-svn: 170756
are more expensive than the non-flag-setting variant. Teach the Thumb2 size
reduction pass to avoid generating them unless we are optimizing for size.
rdar://12892707
llvm-svn: 170728
the script generating it. The test should never be modified manually. If anyone
needs to change it, please change the script and re-run it.
The script is placed into utils/testgen - I couldn't think of a better place,
and after some discussion on IRC this looked like a logical location.
llvm-svn: 170720
Similarly, inlining of the function is inhibited if that would duplicate the call (in particular, inlining is still allowed when there is only one call site and the function has internal linkage).
llvm-svn: 170704
These patches are tested extensively by the test-suite, but
make-check tests are forthcoming once the next
few patches that complete this are committed.
With the next few patches the pass rate for mips16 is
near 100%.
llvm-svn: 170656
physical register $r1 to $r0.
GNU disassembler recognizes an "or" instruction as a "move", and this change
makes the disassembled code easier to read.
Original patch by Reed Kotler.
llvm-svn: 170655
((x & 0xff00) >> 8) << 2
to
(x >> 6) & 0x3fc
This is general goodness since it folds a left shift into the mask. However,
the trailing zeros in the mask prevent the ARM backend from using the bit
extraction instructions. Worse, the mask materialization may require
an additional instruction. This comes up fairly frequently when the result of
the bit twiddling is used as a memory address, e.g.
= ptr[(x & 0xFF0000) >> 16]
We want to generate:
ubfx r3, r1, #16, #8
ldr.w r3, [r0, r3, lsl #2]
vs.
mov.w r9, #1020
and.w r2, r9, r1, lsr #14
ldr r2, [r0, r2]
Add a late ARM specific isel optimization to
ARMDAGToDAGISel::PreprocessISelDAG(). It folds the left shift to the
'base + offset' address computation; change the mask to one which doesn't have
trailing zeros and enable the use of ubfx.
Note the optimization has to be done late since it's target specific and we
don't want to change the DAG normalization. It's also fairly restrictive
as shifter operands are not always free. It's only done for lsh 1 / 2. It's
known to be free on some cpus and they are most common for address
computation.
This is a slight win for blowfish, rijndael, etc.
rdar://12870177
llvm-svn: 170581
When the least bit of C is greater than V, (x&C) must be greater than V
if it is not zero, so the comparison can be simplified.
Although this was suggested in Target/X86/README.txt, it benefits any
architecture with a directly testable form of AND.
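A hypothetical example with C = 8 and V = 3, where the least bit of the
mask already exceeds V:

  // (x & 8) is either 0 or 8, and 8 > 3, so the comparison is equivalent
  // to the cheaper test against zero.
  bool aboveThree(unsigned x) {
    return (x & 8) > 3;   // simplifies to (x & 8) != 0
  }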
Patch by Kevin Schoedel
llvm-svn: 170576
There's probably a better expansion for those nodes than the default for
Altivec, but this is better than crashing. VSELECTs occur in loop vectorizer
output.
llvm-svn: 170551
- An MVT can become an EVT when being split (e.g. v2i8 -> v1i8, the latter doesn't exist)
- Return the scalar value when an MVT is scalarized (v1i64 -> i64)
Fixes PR14639ff.
llvm-svn: 170546
This change adds shadow and origin propagation for unknown intrinsics
by examining the arguments and ModRef behaviour. For now, only 3 classes
of intrinsics are handled:
- those that look like simple SIMD store
- those that look like simple SIMD load
- those that don't have memory effects and look like arithmetic/logic/whatever
operation on simple types.
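For instance, an SSE2 intrinsic such as _mm_storeu_si128 matches the
"simple SIMD store" class (a hypothetical illustration of the source
pattern, not code from the patch):

  #include <emmintrin.h>

  // MSan sees a call that only writes 16 bytes through its pointer
  // argument, so it can copy the shadow of 'v' to the shadow of *p
  // instead of handling the intrinsic conservatively.
  void store128(__m128i *p, __m128i v) {
    _mm_storeu_si128(p, v);
  }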
llvm-svn: 170530