llvm-project

Commit Graph

Author	SHA1	Message	Date
Michael Gottesman	e283e1958a	[TRE] Merged several tests into the test basic.ll. llvm-svn: 185723	2013-07-05 20:45:13 +00:00
Arnold Schwaighofer	97c1343c45	ARM: Add a pack pattern for matching arithmetic shift right llvm-svn: 185714	2013-07-05 18:57:49 +00:00
Arnold Schwaighofer	50b76b5226	ARM: Fix incorrect pack pattern A "pkhtb x, x, y asr #num" uses the lower 16 bits of "y asr #num" and packs them in the bottom half of "x". An arithmetic and a logical shift are only equivalent in this context if the shift amount is 16. We would be shifting in ones into the bottom 16 bits instead of zeros if "y" is negative. radar://14338767 llvm-svn: 185712	2013-07-05 18:28:39 +00:00
Richard Sandiford	c40f27b52d	[SystemZ] Remove no-op MVCs The stack coloring pass has code to delete stores and loads that become trivially dead after coloring. Extend it to cope with single instructions that copy from one frame index to another. The testcase happens to show an example of this kicking in at the moment. It did occur in Real Code too though. llvm-svn: 185705	2013-07-05 14:38:48 +00:00
Richard Sandiford	b5d9bd6f59	Fix double renaming bug in stack coloring pass The stack coloring pass renumbered frame indexes with a loop of the form: for each frame index FI for each instruction I that uses FI for each use of FI in I rename FI to FI' This caused problems if an instruction used two frame indexes F0 and F1 and if F0 was renamed to F1 and F1 to F2. The first time we visited the instruction we changed F0 to F1, then we changed both F1s to F2. In other words, the problem was that SSRefs recorded which instructions used an FI, but not which MachineOperands and MachineMemOperands within that instruction used it. This is easily fixed for MachineOperands by walking the instructions once and processing each operand in turn. There's already a loop to do that for dead store elimination, so it seemed more efficient to fuse the two at the block level. MachineMemOperands are more tricky because they can be shared between instructions. The patch handles them by making SSRefs an array of MachineMemOperands rather than an array of MachineInstrs. We might end up processing the same MachineMemOperand twice, but that's OK because we always know from the SSRefs index what the original frame index was. llvm-svn: 185703	2013-07-05 14:24:47 +00:00
Richard Sandiford	8976ea72ab	[SystemZ] Enable the use of MVC for frame-to-frame spills ...now that the problem that prompted the restriction has been fixed. The original spill-02.py was a compromise because at the time I couldn't find an example that actually failed without the two scavenging slots. The version included here did. llvm-svn: 185701	2013-07-05 14:02:01 +00:00
Ulrich Weigand	b204431106	[PowerPC] Add some special @got@tprel fixup cases When a target@got@tprel or target@got@tprel@l symbol variant is used in a fixup_ppc_half16 (not fixup_ppc_half16ds) context, we currently fail, since the corresponding R_PPC64_GOT_TPREL16 / R_PPC64_GOT_TPREL16_LO relocation types do not exist. However, since such symbol variants resolve to GOT offsets which are always 4-aligned, we can simply instead use the _DS variants of the relocation types, which do exist. The same applies for the @got@dtprel variants. llvm-svn: 185700	2013-07-05 13:49:46 +00:00
Richard Sandiford	23943229f6	[SystemZ] Allocate a second register scavenging slot This is another prerequisite for frame-to-frame MVC copies. I'll commit the patch that makes use of the slot separately. The downside of trying to test many corner cases with each of the available addressing modes is that a fair few tests need to account for the new frame layout. I do still think it's useful to have all these tests though, since it's something that wouldn't get much coverage otherwise. llvm-svn: 185698	2013-07-05 13:11:52 +00:00
Rafael Espindola	8ef843fc72	Don't create an archive if, for example, we are asked to print the index. llvm-svn: 185697	2013-07-05 13:03:07 +00:00
Ulrich Weigand	5abd12fc32	[PowerPC] Make test case buildable with GNU as The ppc64-fixups.s test currently fails to build with GNU as, since it does not support plain symbols as arguments to li/lis. Rewrite the test for R_PPC64_ADDR16 and R_PPC64_REL16 to use lwz instead. Allowing the test case to be built with both LLVM and GNU as makes it easier to spot unwanted difference in the output. llvm-svn: 185694	2013-07-05 12:33:03 +00:00
Ulrich Weigand	5b427591d6	[PowerPC] Support @tls in the asm parser This adds support for the last missing construct to parse TLS-related assembler code: add 3, 4, symbol@tls The ADD8TLS currently hard-codes the @tls into the assembler string. This cannot be handled by the asm parser, since @tls is parsed as a symbol variant. This patch changes ADD8TLS to have the @tls suffix printed as symbol variant on output too, which allows us to remove the isCodeGenOnly marker from ADD8TLS. This in turn means that we can add a AsmOperand to accept @tls marked symbols on input. As a side effect, this means that the fixup_ppc_tlsreg fixup type is no longer necessary and can be merged into fixup_ppc_nofixup. llvm-svn: 185692	2013-07-05 12:22:36 +00:00
Joey Gouly	606f3fbc2b	PR16490: fix a crash in ARMDAGToDAGISel::SelectInlineAsm. In the SelectionDAG immediate operands to inline asm are constructed as two separate operands. The first is a constant of value InlineAsm::Kind_Imm and the second is a constant with the value of the immediate. In ARMDAGToDAGISel::SelectInlineAsm, if we reach an operand of Kind_Imm we should skip over the next operand too. llvm-svn: 185688	2013-07-05 10:19:40 +00:00
David Majnemer	c2a990bc00	InstCombine: (icmp eq B, 0) | (icmp ult A, B) -> (icmp ule A, B-1) This transform allows us to turn IR that looks like: %1 = icmp eq i64 %b, 0 %2 = icmp ult i64 %a, %b %3 = or i1 %1, %2 ret i1 %3 into: %0 = add i64 %b, -1 %1 = icmp uge i64 %0, %a ret i1 %1 which means we go from lowering: cmpq %rsi, %rdi setb %cl testq %rsi, %rsi sete %al orb %cl, %al ret to lowering: decq %rsi cmpq %rdi, %rsi setae %al ret llvm-svn: 185677	2013-07-05 00:31:17 +00:00
David Blaikie	9a300bda38	DebugInfo: Consider global variables without locations to be valid We were being a bit too aggressive here in classifying global variables with no global reference or constant value to be invalid - this would cause LLVM to not emit the DWARF description of the global variable if it had been optimized away, which isn't helpful for users who might benefit from the global variable's description even if there's no location information. This also fixes a crasher issue here that I was unable to reduce a test case for - involving a using decl (& subsequent DW_TAG_imported_declaration ) of such a global variable that, once optimized away, would crash when an attempt to emit the imported declaration was made. llvm-svn: 185675	2013-07-04 23:15:18 +00:00
Nico Rieck	1558c5a6ee	MC: Add .section directive to COFF Supports GAS flags "abdnrswxy". No support for alignment or subsections. Fixes PR16366. llvm-svn: 185669	2013-07-04 21:32:07 +00:00
David Majnemer	37f8f445de	InstCombine: Reimplementation of visitUDivOperand This transform was originally added in r185257 but later removed in r185415. The original transform would create instructions speculatively and then discard them if the speculation was proved incorrect. This has been replaced with a scheme that splits the transform into two parts: preflight and fold. While we preflight, we build up fold actions that inform the folding stage on how to act. llvm-svn: 185667	2013-07-04 21:17:49 +00:00
Rafael Espindola	1cbed22836	Add support for archives with no symbol table or string table. llvm-svn: 185664	2013-07-04 19:40:23 +00:00
Ulrich Weigand	d3ac7c058b	[PowerPC] Implement writeNopData This implements a proper PPCAsmBackend::writeNopData routine that actually writes PowerPC nop instructions. This fixes the last remaining difference in object file output (text section) between the integrated assembler and GNU as that I've seen anywhere. llvm-svn: 185662	2013-07-04 18:28:46 +00:00
Rafael Espindola	31c3b2ddee	Add 'not' in front of a command that is expected to fail. llvm-svn: 185659	2013-07-04 17:21:01 +00:00
Joey Gouly	cc4ff9e907	Add support for MC assembling and disassembling of vsel{ge, gt, eq, vs} instructions. This adds a new decoder table/namespace 'VFPV8', as these instructions have their top 4 bits as 0b1111, while other Thumb instructions have 0b1110. llvm-svn: 185642	2013-07-04 14:57:20 +00:00
Ulrich Weigand	56b0e7b011	[PowerPC] Add all trap mnemonics This adds support for all basic and extended variants of the trap instructions to the asm parser. llvm-svn: 185638	2013-07-04 14:40:12 +00:00
Ulrich Weigand	b86cb7d04b	[PowerPC] Add asm parser support for CR expressions This adds support for specifying condition registers and condition register fields via expressions using the symbols defined by the PowerISA, like "4*cr2+eq". llvm-svn: 185633	2013-07-04 14:24:00 +00:00
Benjamin Kramer	371722288c	SimplifyCFG: Teach switch generation some patterns that instcombine forms. This allows us to create switches even if instcombine has munged two of the incoming compares into one and some bit twiddling. This was motivated by enum compares that are common in clang. llvm-svn: 185632	2013-07-04 14:22:02 +00:00
Joey Gouly	39f7488294	Add a V8FP instruction 'vcvt{b,t}' to convert between half and double precision. llvm-svn: 185620	2013-07-04 10:04:08 +00:00
Quentin Colombet	04b3a0fdb2	[ARM] Improve the instruction selection of vector loads. In the ARM back-end, build_vector nodes are lowered to a target specific build_vector that uses floating point type. This works well, unless the inserted bitcasts survive until instruction selection. In that case, they incur moves between integer unit and floating point unit that may result in inefficient code. In other words, this conversion may introduce artificial dependencies when the code leading to the build vector cannot be completed with a floating point type. In particular, this happens when loads are not aligned. Before this patch, in that case, the compiler generates general purpose loads and creates the floating point vector from them, instead of directly using the vector unit. The patch uses a vector friendly sequence of code when the inserted bitcasts to floating point survived DAGCombine. This is done by a target specific DAGCombine that changes the target specific build_vector into a sequence of insert_vector_elt that get rid of the bitcasts. <rdar://problem/14170854> llvm-svn: 185587	2013-07-03 21:42:57 +00:00
Tilmann Scheller	ef5666fbbf	ARM: Prevent ARMAsmParser::shouldOmitCCOutOperand() from misidentifying certain Thumb2 add immediate T3 encodings. Before the fix Thumb2 instructions of type "add rD, rN, #imm" (T3 encoding, see ARM ARM A8.8.4) with rD and rN both being low registers (r0-r7) were classified as having the T4 encoding. The T4 encoding doesn't have a cc_out operand so for above instructions the operand gets erroneously removed, corrupting the token stream and leading to parse errors later in the process. This bug prevented "add r1, r7, #0xcbcbcbcb" from being assembled correctly. Fixes <rdar://problem/14224440>. llvm-svn: 185575	2013-07-03 20:38:01 +00:00
Ulrich Weigand	2542b3b17f	[PowerPC] Support lmw/stmw in the asm parser This adds support for the load/store multiple instructions, currently used by the asm parser only. llvm-svn: 185564	2013-07-03 18:29:47 +00:00
Ulrich Weigand	49f487e6cd	[PowerPC] Use mtocrf when available Just as with mfocrf, it is also preferable to use mtocrf instead of mtcrf when only a single CR register is to be written. Current code however always emits mtcrf. This probably does not matter when using an external assembler, since the GNU assembler will in fact automatically replace mtcrf with mtocrf when possible. It does create inefficient code with the integrated assembler, however. To fix this, this patch adds MTOCRF/MTOCRF8 instruction patterns and uses those instead of MTCRF/MTCRF8 everywhere. Just as done in the MFOCRF patch committed as 185556, these patterns will be converted back to MTCRF if MTOCRF is not available on the machine. As a side effect, this allows us to modify the MTCRF pattern to accept the full range of mask operands for the benefit of the asm parser. llvm-svn: 185561	2013-07-03 17:59:07 +00:00
Rafael Espindola	b0fccb225c	Prefix failing commands with not to make clear they are expected to fail. llvm-svn: 185554	2013-07-03 16:41:29 +00:00
Rafael Espindola	8490bbd16b	Remove another old test. It was only passing because 'grep andpd' was not finding any andpd, but we don't fail if part of a pipe fails. llvm-svn: 185552	2013-07-03 16:35:26 +00:00
Rafael Espindola	447dbc38b6	Remove test for the old EH system. It doesn't parse anymore. llvm-svn: 185551	2013-07-03 16:30:01 +00:00
Rafael Espindola	0bdc4bb684	Fix test: It was missing run lines and llvm-dis has no -disable-verify option. llvm-svn: 185550	2013-07-03 16:27:55 +00:00
Rafael Espindola	88ae7dd230	Add support for gnu archives with a string table and no symtab. While there, use early returns to reduce nesting. llvm-svn: 185547	2013-07-03 15:57:14 +00:00
Rafael Espindola	8b82a4d36e	Make llvm-nm return 1 on error. This is a small compatibility improvement with gnu nm and makes llvm-nm more useful as a testing tool. llvm-svn: 185546	2013-07-03 15:46:03 +00:00
Evgeniy Stepanov	dc6d7eb860	[msan] Unpoison stack allocations and undef values in blacklisted functions. This changes behavior of -msan-poison-stack=0 flag from not poisoning stack allocations to actively unpoisoning them. llvm-svn: 185538	2013-07-03 14:39:14 +00:00
Ulrich Weigand	ae9cf5828c	[PowerPC] Support mtspr/mfspr in the asm parser This adds support for the generic forms of mtspr/mfspr for the asm parser. The compiler will continue to use the specialized patterns for mtlr etc. since those are needed to correctly describe data flow. llvm-svn: 185532	2013-07-03 12:32:41 +00:00
Richard Sandiford	ed1fab6b5b	[SystemZ] Fold more spills Add a mapping from register-based <INSN>R instructions to the corresponding memory-based <INSN>. Use it to cut down on the number of spill loads. Some instructions extend their operands from smaller fields, so this required a new TSFlags field to say how big the unextended operand is. This optimisation doesn't trigger for C(G)R and CL(G)R because in practice we always combine those instructions with a branch. Adding a test for every other case probably seems excessive, but it did catch a missed optimisation for DSGF (fixed in r185435). llvm-svn: 185529	2013-07-03 10:10:02 +00:00
Mihai Popa	d36cbaa423	This corrects the implementation of Thumb ADR instruction. There are three issues: 1. it should accept only 4-byte aligned addresses 2. the maximum offset should be 1020 3. it should be encoded with the offset scaled by two bits llvm-svn: 185528	2013-07-03 09:21:44 +00:00
Tim Northover	36b2417f18	ARM: relax the atomic release barrier to "dmb ishst" on Swift Swift cores implement store barriers that are stronger than the ARM specification but weaker than general barriers. They are, in fact, just about enough to provide the ordering needed for atomic operations with release semantics. This patch makes use of that quirk. llvm-svn: 185527	2013-07-03 09:20:36 +00:00
Richard Osborne	a1cff61dec	[XCore] Add ISel pattern for LDWCP Patch by Robert Lytton. llvm-svn: 185518	2013-07-03 07:48:50 +00:00
Michael Gottesman	bed2e82501	Change the gettimeofday test to only test on a posix platform. llvm-svn: 185503	2013-07-03 04:15:22 +00:00
Michael Gottesman	2db11161a8	Added support in FunctionAttrs for adding relevant function/argument attributes for the posix call gettimeofday. This implies annotating it as nounwind and its arguments as nocapture. To be conservative, we do not annotate the arguments with noalias since some platforms do not have restrict on the declaration for gettimeofday. llvm-svn: 185502	2013-07-03 04:00:54 +00:00
Manman Ren	94119ceebb	Trying to fix the bots llvm-svn: 185489	2013-07-03 00:16:11 +00:00
Manman Ren	ac8062bb72	Debug Info: use module flag to set up Dwarf version. Correctly handles ref_addr depending on the Dwarf version. Emit Dwarf with version from module flag. TODO: turn on/off features depending on the Dwarf version. llvm-svn: 185484	2013-07-02 23:40:10 +00:00
Ulrich Weigand	42a09dc12f	[PowerPC] PR16512 - Support TLS call sequences in the asm parser This patch now adds support for recognizing TLS call sequences in the asm parser. This needs a new pattern BL8_TLS, which is like BL8_NOP_TLS except without nop. That pattern is used for the asm parser only. llvm-svn: 185478	2013-07-02 21:31:59 +00:00
Ulrich Weigand	4050995650	[PowerPC] Remove VK_PPC_TLSGD and VK_PPC_TLSLD The PowerPC-specific modifiers VK_PPC_TLSGD and VK_PPC_TLSLD correspond exactly to the generic modifiers VK_TLSGD and VK_TLSLD. This causes some confusion with the asm parser, since VK_PPC_TLSGD is output as @tlsgd, which is then read back in as VK_TLSGD. To avoid this confusion, this patch removes the PowerPC-specific modifiers and uses the generic modifiers throughout. (The only drawback is that the generic modifiers are printed in upper case while the usual convention on PowerPC is to use lower-case modifiers. But this is just a cosmetic issue.) llvm-svn: 185476	2013-07-02 21:29:06 +00:00
Jyotsna Verma	ddca5fa24a	Add 'REQUIRES: object-emission' to DebugInfo/inlined-arguments.ll. llvm-svn: 185465	2013-07-02 19:21:43 +00:00
Ulrich Weigand	0f0398246c	[PowerPC] Support TLS variables in debug info This adds an implementation of getDebugThreadLocalSymbol for (64-bit) PowerPC. This needs to return a generic MCExpr since on ppc64, we need to add a bias of 0x8000 to the value returned by the R_PPC64_DTPREL64 relocation. llvm-svn: 185461	2013-07-02 18:47:35 +00:00
Richard Sandiford	e6e7885591	[SystemZ] Use DSGFR over DSGR in more cases Fixes some cases where we were using full 64-bit division for (sdiv i32, i32) and (sdiv i64, i32). The "32" in "SDIVREM32" just refers to the second operand. The first operand of all DIVREMs is a GR128. llvm-svn: 185435	2013-07-02 15:40:22 +00:00
Richard Sandiford	f6bae1e434	[SystemZ] Use MVC to spill loads and stores Try to use MVC when spilling the destination of a simple load or the source of a simple store. As explained in the comment, this doesn't yet handle the case where the load or store location is also a frame index, since that could lead to two simultaneous scavenger spills, something the backend can't handle yet. spill-02.py tests that this restriction kicks in, but unfortunately I've not yet found a case that would fail without it. The volatile trick I used for other scavenger tests doesn't work here because we can't use MVC for volatile accesses anyway. I'm planning on relaxing the restriction later, hopefully with a test that does trigger the problem... Tests @f8 and @f9 also showed that L(G)RL and ST(G)RL were wrongly classified as SimpleBDX{Load,Store}. It wouldn't be easy to test for that bug separately, which is why I didn't split out the fix as a separate patch. llvm-svn: 185434	2013-07-02 15:28:56 +00:00
Richard Sandiford	1d959008d6	[SystemZ] Add the MVC instruction This is the first use of D(L,B) addressing, which required a fair bit of surgery. For that reason, the patch just adds the instruction definition and the associated assembler and disassembler support. A later patch will actually make use of it for codegen. llvm-svn: 185433	2013-07-02 14:56:45 +00:00
Richard Osborne	e4cc98686a	[XCore] Fix instruction selection for zext, mkmsk instructions. r182680 replaced CountLeadingZeros_32 with a template function countLeadingZeros that relies on using the correct argument type to give the right result. The type passed in the XCore backend after this revision was incorrect in a couple of places. Patch by Robert Lytton. llvm-svn: 185430	2013-07-02 14:46:34 +00:00
Logan Chien	c931fce404	Fix ARM EHABI compact model 1 and 2 without handlerdata. According to ARM EHABI section 9.2, if the __aeabi_unwind_cpp_pr1() or __aeabi_unwind_cpp_pr2() is used, then the handler data must be emitted after the unwind opcodes. The handler data consists of several words, and should be terminated by zero. In case that the .handlerdata directive is not specified by the programmer, we should emit zero to terminate the handler data. llvm-svn: 185422	2013-07-02 12:43:27 +00:00
Tim Northover	6823900e55	DAGCombiner: fix use-counting issue when forming zextload DAGCombiner was counting all uses of a load node when considering whether it's worth combining into a zextload. Really, it wants to ignore the chain and just count real uses. rdar://problem/13896307 llvm-svn: 185419	2013-07-02 09:58:53 +00:00
Hal Finkel	fdbe161b1a	Revert r185257 (InstCombine: Be more aggressive optimizing 'udiv' instrs with 'select' denoms) I'm reverting this commit because: 1. As discussed during review, it needs to be rewritten (to avoid creating and then deleting instructions). 2. This is causing optimizer crashes. Specifically, I'm seeing things like this: While deleting: i1 % Use still stuck around after Def is destroyed: <badref> = select i1 <badref>, i32 0, i32 1 opt: /src/llvm-trunk/lib/IR/Value.cpp:79: virtual llvm::Value::~Value(): Assertion `use_empty() && "Uses remain when a value is destroyed!"' failed. I'd guess that these will go away once we're no longer creating/deleting instructions here, but just in case, I'm adding a regression test. Because the code is being rewritten, I've just XFAIL'd the original regression test. Original commit message: InstCombine: Be more aggressive optimizing 'udiv' instrs with 'select' denoms Real world code sometimes has the denominator of a 'udiv' be a 'select'. LLVM can handle such cases but only when the 'select' operands are symmetric in structure (both select operands are a constant power of two or a left shift, etc.). This falls apart if we are dealt a 'udiv' where the code is not symmetric or if the select operands lead us to more select instructions. Instead, we should treat the LHS and each select operand as a distinct divide operation and try to optimize them independently. If we can simplify each operation, then we can replace the 'udiv' with, say, a 'lshr' that has a new select with a bunch of new operands for the select. llvm-svn: 185415	2013-07-02 05:21:11 +00:00
Hal Finkel	52727c6b82	Cleanup PPC Altivec registers in CSR lists and improve VRSAVE handling There are a couple of (small) related changes here: 1. The printed name of the VRSAVE register has been changed from VRsave to vrsave in order to match the name accepted by GNU binutils. 2. Support for parsing vrsave has been added to the asm parser (it seems that there was no test case specifically covering this code, so I've added one). 3. The list of Altivec registers, which was common to all calling conventions, has been separated out. This allows us to define the base CSR lists, and then lists for each ABI with Altivec included. This allows SjLj, for example, to work correctly on non-Altivec targets without using unnatural definitions of the NoRegs CSR list. 4. VRSAVE is now always reserved on non-Darwin targets and all Altivec registers are reserved when Altivec is disabled. With these changes, it is now possible to compile a function containing __builtin_unwind_init() on Linux/PPC64 with debugging information. This did not work previously because GNU binutils assumes that all .cfi_offset offsets will be 8-byte aligned on PPC64 (and errors out if you provide a non-8-byte-aligned offset). This is not true for the vrsave register, however, because this register is used only on Darwin, GCC does not bother printing a .cfi_offset entry for it (even though there is a slot in the stack frame for it as specified by the ABI). This change allows us to do the same: we will also not print .cfi_offset directives for vrsave. llvm-svn: 185409	2013-07-02 03:39:34 +00:00
David Blaikie	8466ca86fe	PR14728: DebugInfo: TLS variables with -gsplit-dwarf llvm-svn: 185398	2013-07-01 23:55:52 +00:00
Ulrich Weigand	f11efe7f48	[PowerPC] Add support for TLS data relocations This adds support for TLS data relocations and modifiers: .quad target@dtpmod .quad target@tprel .quad target@dtprel Currently exploited by the asm parser only. llvm-svn: 185394	2013-07-01 23:33:29 +00:00
David Blaikie	1b01ae8648	PR16493: DebugInfo with TLS on PPC crashing due to invalid relocation Restrict the current TLS support to X86 ELF for now. Test that we don't produce it on PPC & we can flesh that test case out with the right thing once someone implements it. llvm-svn: 185389	2013-07-01 21:45:25 +00:00
Ulrich Weigand	85c6f7f7a7	[PowerPC] Support all condition register logical instructions This adds support for all missing condition register logical instructions and extended mnemonics to the asm parser. llvm-svn: 185387	2013-07-01 21:40:54 +00:00
Bill Schmidt	48fc20a034	Index: test/CodeGen/PowerPC/reloc-align.ll =================================================================== --- test/CodeGen/PowerPC/reloc-align.ll (revision 0) +++ test/CodeGen/PowerPC/reloc-align.ll (revision 0) @@ -0,0 +1,34 @@ +; RUN: llc -mcpu=pwr7 -O1 < %s | FileCheck %s + +; This test verifies that the peephole optimization of address accesses +; does not produce a load or store with a relocation that can't be +; satisfied for a given instruction encoding. Reduced from a test supplied +; by Hal Finkel. + +target datalayout = "E-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-f128:128:128-v128:128:128-n32:64" +target triple = "powerpc64-unknown-linux-gnu" + +%struct.S1 = type { [8 x i8] } + +@main.l_1554 = internal global { i8, i8, i8, i8, i8, i8, i8, i8 } { i8 -1, i8 -6, i8 57, i8 62, i8 -48, i8 0, i8 58, i8 80 }, align 1 + +; Function Attrs: nounwind readonly +define signext i32 @main() #0 { +entry: + %call = tail call fastcc signext i32 @func_90(%struct.S1* byval bitcast ({ i8, i8, i8, i8, i8, i8, i8, i8 }* @main.l_1554 to %struct.S1)) +; CHECK-NOT: ld {{[0-9]+}}, main.l_1554@toc@l + ret i32 %call +} + +; Function Attrs: nounwind readonly +define internal fastcc signext i32 @func_90(%struct.S1 byval nocapture %p_91) #0 { +entry: + %0 = bitcast %struct.S1* %p_91 to i64* + %bf.load = load i64* %0, align 1 + %bf.shl = shl i64 %bf.load, 26 + %bf.ashr = ashr i64 %bf.shl, 54 + %bf.cast = trunc i64 %bf.ashr to i32 + ret i32 %bf.cast +} + +attributes #0 = { nounwind readonly "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf"="true" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "unsafe-fp-math"="false" "use-soft-float"="false" } Index: lib/Target/PowerPC/PPCAsmPrinter.cpp =================================================================== --- lib/Target/PowerPC/PPCAsmPrinter.cpp (revision 185327) +++ lib/Target/PowerPC/PPCAsmPrinter.cpp (working copy) @@ -679,7 +679,26 @@ void PPCAsmPrinter::EmitInstruction(const MachineI OutStreamer.EmitRawText(StringRef("\tmsync")); return; } + break; + case PPC::LD: + case PPC::STD: + case PPC::LWA: { + // Verify alignment is legal, so we don't create relocations + // that can't be supported. + // FIXME: This test is currently disabled for Darwin. The test + // suite shows a handful of test cases that fail this check for + // Darwin. Those need to be investigated before this sanity test + // can be enabled for those subtargets. + if (!Subtarget.isDarwin()) { + unsigned OpNum = (MI->getOpcode() == PPC::STD) ? 2 : 1; + const MachineOperand &MO = MI->getOperand(OpNum); + if (MO.isGlobal() && MO.getGlobal()->getAlignment() < 4) + llvm_unreachable("Global must be word-aligned for LD, STD, LWA!"); + } + // Now process the instruction normally. + break; } + } LowerPPCMachineInstrToMCInst(MI, TmpInst, this); OutStreamer.EmitInstruction(TmpInst); Index: lib/Target/PowerPC/PPCISelDAGToDAG.cpp =================================================================== --- lib/Target/PowerPC/PPCISelDAGToDAG.cpp (revision 185327) +++ lib/Target/PowerPC/PPCISelDAGToDAG.cpp (working copy) @@ -1530,6 +1530,14 @@ void PPCDAGToDAGISel::PostprocessISelDAG() { if (GlobalAddressSDNode GA = dyn_cast<GlobalAddressSDNode>(ImmOpnd)) { SDLoc dl(GA); const GlobalValue GV = GA->getGlobal(); + // We can't perform this optimization for data whose alignment + // is insufficient for the instruction encoding. + if (GV->getAlignment() < 4 && + (StorageOpcode == PPC::LD || StorageOpcode == PPC::STD || + StorageOpcode == PPC::LWA)) { + DEBUG(dbgs() << "Rejected this candidate for alignment.\n\n"); + continue; + } ImmOpnd = CurDAG->getTargetGlobalAddress(GV, dl, MVT::i64, 0, Flags); } else if (ConstantPoolSDNode CP = dyn_cast<ConstantPoolSDNode>(ImmOpnd)) { llvm-svn: 185380	2013-07-01 20:52:27 +00:00
Chad Rosier	fa705ee36c	[ARMAsmParser] Sort the ARM register lists based on the encoding value, not the tablegen enum values. This should be the last fix due to fallout from r185094. llvm-svn: 185379	2013-07-01 20:49:23 +00:00
Ulrich Weigand	f7152a8596	[PowerPC] Also add "msync" alias This adds an alias for "msync" (which is used on Book E systems instead of "sync"). llvm-svn: 185375	2013-07-01 20:39:50 +00:00
Akira Hatanaka	263c6af8f3	[mips] Increase the number of floating point control registers available to 32. Create a dedicated register class for floating point condition code registers and move FCC0 from register class CCR to the new register class. llvm-svn: 185373	2013-07-01 20:31:44 +00:00
Akira Hatanaka	8b5b1e072f	[mips] Fix test case to check that mips64 instructions are generated. llvm-svn: 185371	2013-07-01 20:18:58 +00:00
Anton Korobeynikov	ba8f4c5e29	Really fix the test. Sorry for the breakage... llvm-svn: 185369	2013-07-01 19:51:36 +00:00
Anton Korobeynikov	0267837076	Fix the test which relies on uncommitted change llvm-svn: 185368	2013-07-01 19:50:31 +00:00
Anton Korobeynikov	82bedb1f3b	Add jump tables handling for MSP430. Patch by Job Noorman! llvm-svn: 185364	2013-07-01 19:44:44 +00:00
Cameron Zwarich	867bfcd546	Fix PR16508. When phis get lowered, destination copies are inserted using an iterator that is determined once for all phis in the block, which BuildMI interprets as a request to insert an instruction directly before the iterator. In the case of a cyclic phi, source copies may also be inserted directly before this iterator, which can cause source copies to be inserted before destination copies. The fix is to keep an iterator to the last phi and then advance it while lowering each phi in order to insert destination copies directly after the phis. llvm-svn: 185363	2013-07-01 19:42:46 +00:00
Hal Finkel	25e4a0d418	Don't form PPC CTR loops for over-sized exit counts Although you can't generate this from C on PPC64, if you have a loop using a 64-bit counter on PPC32 then you can't form a CTR-based loop for it. This had been causing the PPCCTRLoops pass to assert. Thanks to Joerg Sonnenberger for providing a test case! llvm-svn: 185361	2013-07-01 19:34:59 +00:00
Tim Northover	8625fd8cad	AArch64: correct CodeGen of MOVZ/MOVK combinations. According to the AArch64 ELF specification (4.6.8), it's the assembler's responsibility to make sure the shift amount is correct in relocated MOVZ/MOVK instructions. This wasn't being obeyed by either the MCJIT CodeGen or RuntimeDyldELF (which happened to work out well for JIT tests). This commit should make us compliant in this area. llvm-svn: 185360	2013-07-01 19:23:10 +00:00
Matt Beaumont-Gay	8b30c13e12	(1) Add ".test" to test/Other/lit.local.cfg, so llvm-cov.test is actually run. (2) Rename llvm-cov test inputs so the string "llvm-cov" doesn't get substituted by lit within the input filenames on the RUN line. (3) XFAIL llvm-cov.test because it asserts: include/llvm/ADT/SmallVector.h:140: reference llvm::SmallVectorTemplateCommon<llvm::GCOVBlock , void>::operator[](unsigned int) [T = llvm::GCOVBlock ]: Assertion `begin() + idx < end()' failed. llvm-svn: 185358	2013-07-01 18:58:53 +00:00
Tim Northover	7f3d9e1f36	Revert r185339 (ARM: relax the atomic release barrier to "dmb ishst") Turns out I'd misread the architecture reference manual and thought that was a load/store-store barrier, when it's not. Thanks for pointing it out Eli! llvm-svn: 185356	2013-07-01 18:37:33 +00:00
Ulrich Weigand	3a75861b06	[PowerPC] Fix @got references to local symbols A @got reference must always result in a relocation, so that the linker has a chance to set up the GOT entry, even if the symbol happens to be local. Add a PPCELFObjectWriter::ExplicitRelSym routine that enforces a relocation to be emitted for GOT references. llvm-svn: 185353	2013-07-01 18:19:56 +00:00
Ulrich Weigand	7a9fcdf6fb	[PowerPC] Add "wait" instruction This adds the "wait" instruction and its extended mnemonics. llvm-svn: 185350	2013-07-01 17:21:23 +00:00
Ulrich Weigand	98fcc7b6bc	[PowerPC] Support "eieio" instruction This adds support for the "eieio" instruction to the asm parser. llvm-svn: 185349	2013-07-01 17:06:26 +00:00
Ulrich Weigand	421843229c	[PowerPC] Add some existing instructions to ppc64-encoding-bookII.s The test case had a couple of FIXMEs where the instruction is in fact already supported by the back-end. In some other case, while the generic form of the instruction is not yet supported, a specialized form is. This adds tests for those already supported instructions / instruction forms. llvm-svn: 185347	2013-07-01 16:52:55 +00:00
Ulrich Weigand	797f1a3f5b	[PowerPC] Add variants of "sync" instruction This adds support for the "sync $L" instruction with operand, and provides aliases for "lwsync" and "ptesync". llvm-svn: 185344	2013-07-01 16:37:52 +00:00
Tim Northover	953abab40a	ARM: relax the atomic release barrier to "dmb ishst" I believe the full "dmb ish" barrier is not required to guarantee release semantics for atomic operations. The weaker "dmb ishst" prevents previous operations being reordered with a store executed afterwards, which is enough. A key point to note (fortunately already correct) is that this barrier alone is insufficient for sequential consistency, no matter how liberally placed. llvm-svn: 185339	2013-07-01 14:48:48 +00:00
Justin Holewinski	d2bbdf05e0	[NVPTX] Add support for module-scope inline asm Since we were explicitly not calling AsmPrinter::doInitialization, any module-scope inline asm was not being printed. llvm-svn: 185336	2013-07-01 13:00:14 +00:00
Justin Holewinski	51cb1349dc	[NVPTX] 64-bit ADDC/ADDE are not legal llvm-svn: 185333	2013-07-01 12:59:04 +00:00
Justin Holewinski	dff28d215f	[NVPTX] Fix vector loads from parameters that span multiple loads, and fix some typos llvm-svn: 185332	2013-07-01 12:59:01 +00:00
Justin Holewinski	a2911283e4	[NVPTX] Handle signext/zeroext attributes properly Fix a case where we were incorrectly sign-extending a value when we should have been zero-extending the value. Also change some SIGN_EXTEND to ANY_EXTEND because we really don't care and may have more opportunity to fold subexpressions llvm-svn: 185331	2013-07-01 12:58:58 +00:00
Justin Holewinski	318c625ff4	[NVPTX] Add support for native SIGN_EXTEND_INREG where available llvm-svn: 185330	2013-07-01 12:58:56 +00:00
Justin Holewinski	e40e929eb1	[NVPTX] Add isel patterns for [reg+offset] form of ldg/ldu. llvm-svn: 185329	2013-07-01 12:58:52 +00:00
Justin Holewinski	e8c93e3378	[NVPTX] Make sure we zero out high-order 24 bits for 8-bit load into 32-bit value llvm-svn: 185328	2013-07-01 12:58:48 +00:00
NAKAMURA Takumi	234acdfdc8	llvm-symbolizer: Recognize a drive letter on win32. Then "REQUIRES: shell" can be removed. FIXME: Could we use llvm::sys::Path here? llvm-svn: 185322	2013-07-01 09:51:42 +00:00
Serge Pavlov	ff9a65c6a6	Added the test missed from r185080. llvm-svn: 185316	2013-07-01 09:02:33 +00:00
Arnold Schwaighofer	ef51cf202b	LoopVectorize: Math functions only read rounding mode Math functions are marked as readonly because they read the floating point rounding mode. Because we don't vectorize loops that would contain function calls that set the rounding mode it is safe to ignore this memory read. llvm-svn: 185299	2013-07-01 00:54:44 +00:00
Stephen Lin	2e551adcd9	DeadArgumentElimination: keep return value on functions that have a live argument with the 'returned' attribute (rather than generate invalid IR); however, if both can be eliminated, both will be llvm-svn: 185290	2013-06-30 20:26:21 +00:00
Benjamin Kramer	cc846016bf	ConstantFold: Check that truncating the other side is safe under a sext when trying to remove a sext from a compare. Fixes PR16462. llvm-svn: 185284	2013-06-30 13:47:43 +00:00
David Majnemer	7a69d2c06a	ValueTracking: Teach isKnownToBeAPowerOfTwo about (ADD X, (XOR X, Y)) where X is a power of two This allows us to simplify urem instructions involving the add+xor to turn into simpler math. llvm-svn: 185272	2013-06-29 23:44:53 +00:00
Benjamin Kramer	4093f29366	InstCombine: Also turn selects fed by an and into arithmetic when the types don't match. Inserting a zext or trunc is sufficient. This pattern is somewhat common in LLVM's pointer mangling code. llvm-svn: 185270	2013-06-29 21:17:04 +00:00
Vincent Lejeune	77a8352476	R600: Support schedule and packetization of trans-only inst llvm-svn: 185268	2013-06-29 19:32:43 +00:00
David Majnemer	5953d3712a	InstCombine: FoldGEPICmp shouldn't change sign of base pointer comparison Changing the sign when comparing the base pointer would introduce all sorts of unexpected things like: %gep.i = getelementptr inbounds [1 x i8]* %a, i32 0, i32 0 %gep2.i = getelementptr inbounds [1 x i8]* %b, i32 0, i32 0 %cmp.i = icmp ult i8* %gep.i, %gep2.i %cmp.i1 = icmp ult [1 x i8]* %a, %b %cmp = icmp ne i1 %cmp.i, %cmp.i1 ret i1 %cmp into: %cmp.i = icmp slt [1 x i8]* %a, %b %cmp.i1 = icmp ult [1 x i8]* %a, %b %cmp = xor i1 %cmp.i, %cmp.i1 ret i1 %cmp By preserving the original sign, we now get: ret i1 false This fixes PR16483. llvm-svn: 185259	2013-06-29 10:28:04 +00:00
David Majnemer	797227eea6	InstCombine: Be more aggressive optimizing 'udiv' instrs with 'select' denoms Real world code sometimes has the denominator of a 'udiv' be a 'select'. LLVM can handle such cases but only when the 'select' operands are symmetric in structure (both select operands are a constant power of two or a left shift, etc.). This falls apart if we are dealt a 'udiv' where the code is not symmetric or if the select operands lead us to more select instructions. Instead, we should treat the LHS and each select operand as a distinct divide operation and try to optimize them independently. If we can simplify each operation, then we can replace the 'udiv' with, say, a 'lshr' that has a new select with a bunch of new operands for the select. llvm-svn: 185257	2013-06-29 08:40:07 +00:00
David Majnemer	b889e405eb	InstCombine: Optimize (1 << X) Pred CstP2 to X Pred Log2(CstP2) We may, after other optimizations, find ourselves with IR that looks like: %shl = shl i32 1, %y %cmp = icmp ult i32 %shl, 32 Instead, we should just compare the shift count: %cmp = icmp ult i32 %y, 5 llvm-svn: 185242	2013-06-28 23:42:03 +00:00
Jakob Stoklund Olesen	0b075103cd	Minimize precision loss when computing cyclic probabilities. Allow block frequencies to exceed 32 bits by using the new BlockFrequency division function. llvm-svn: 185236	2013-06-28 22:40:43 +00:00
Hal Finkel	ac1a24b508	PPC: Ignore spill/restore requests for VRSAVE (except on Darwin) This fixes PR16418, which reports that a function calling __builtin_unwind_init() asserts. The cause is that this generates a spill/restore for VRSAVE, and we support that only on Darwin (because VRSAVE is only really used on Darwin). The test case checks only that we don't crash. We can add correctness checks once someone verifies what behavior the function is supposed to have. llvm-svn: 185235	2013-06-28 22:29:56 +00:00
Nadav Rotem	060be733a5	SLP Vectorizer: Add support for trees with external users. To support this we have to insert 'extractelement' instructions to pick the right lane. We had this functionality before but I removed it when we moved to the multi-block design because it was too complicated. llvm-svn: 185230	2013-06-28 22:07:09 +00:00
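
The r185242 entry in the log above (InstCombine: Optimize (1 << X) Pred CstP2 to X Pred Log2(CstP2)) rewrites a comparison of a shifted one into a comparison of the shift count. The following is a minimal Python sketch that brute-force checks that equivalence. It is not LLVM's implementation: the helper names `shl_one`, `fold_holds` and `example_from_log` are hypothetical, only the unsigned less-than predicate (`ult`) from the commit's own example is covered, and shift amounts are assumed to be smaller than the bit width.

```python
def shl_one(y, width):
    """Model `shl iN 1, y` for 0 <= y < width (assumption: larger shifts are out of scope)."""
    return (1 << y) & ((1 << width) - 1)


def fold_holds(width):
    """Check (1 << y) ult 2**k  <=>  y ult k for every in-range y and k at this width."""
    for k in range(width):
        c = 1 << k  # CstP2: a constant power of two, so Log2(CstP2) == k
        for y in range(width):
            before = shl_one(y, width) < c  # icmp ult (shl 1, y), CstP2
            after = y < k                   # icmp ult y, Log2(CstP2)
            if before != after:
                return False
    return True


def example_from_log():
    """The pair quoted in the commit message: `ult (shl i32 1, y), 32` vs `ult y, 5`."""
    return all((shl_one(y, 32) < 32) == (y < 5) for y in range(32))
```

Calling `fold_holds(32)` walks every pair of shift count and power-of-two constant at 32 bits, which is cheap because both ranges have only 32 values.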


19916 Commits