llvm-project

Commit Graph

Author	SHA1	Message	Date
Tim Northover	a2292d0b8f	ARM: diagnose ARM/Thumb assembly switches on CPUs only supporting one. Some ARM CPUs only support ARM mode (ancient v4 ones, for example) and some only support Thumb mode (M-class ones currently). This makes sure such CPUs default to the correct mode and makes the AsmParser diagnose an attempt to switch modes incorrectly. rdar://14024354 llvm-svn: 183710	2013-06-10 23:20:58 +00:00
Tim Northover	6833e3fd75	X86: Stop LEA64_32r doing unspeakable things to its arguments. Previously LEA64_32r went through virtually the entire backend thinking it was using 32-bit registers until its blissful illusions were cruelly snatched away by MCInstLower and 64-bit equivalents were substituted at the last minute. This patch makes it behave normally, and take 64-bit registers as sources all the way through. Previous uses (for 32-bit arithmetic) are accommodated via SUBREG_TO_REG instructions which make the types and classes agree properly. llvm-svn: 183693	2013-06-10 20:43:49 +00:00
Ulrich Weigand	4c44032aa1	[PowerPC] Support extended sc mnemonic A plain "sc" without argument is supposed to be treated like "sc 0" by the assembler. This patch adds a corresponding alias. Problem reported by Joerg Sonnenberger. llvm-svn: 183687	2013-06-10 17:19:43 +00:00
Ulrich Weigand	aa4a2d71aa	[PowerPC] Support branch mnemonics with implied CR0 The extended branch mnemonics are supposed to use an implied CR0 if there is no explicit condition register specified. This patch adds extra variants of the mnemonics to this effect. Problem reported by Joerg Sonnenberger. llvm-svn: 183686	2013-06-10 17:19:15 +00:00
Amaury de la Vieuville	43cb13a5c9	ARM: ISB cannot be passed the same options as DMB ISB should only accepts full system sync, other options are reserved llvm-svn: 183656	2013-06-10 14:17:08 +00:00
Justin Holewinski	b96d1395f6	[NVPTX] Remove old CONST_NOT_GEN address space that is not being used anymore and causes constants to be emitted in the global address space llvm-svn: 183652	2013-06-10 13:29:47 +00:00
JF Bastien	c04341bb34	Add test for ARM FastISel load/store register classes r183624 fixed an issue that was tested indirectly. Test it directly with this new test. llvm-svn: 183634	2013-06-10 00:35:57 +00:00
Reed Kotler	ce510830c5	Fix a regression I introduced when I expanded the complex pseudos in the Mips16 port. A few of the psuedos could either take signed or unsigned arguments and I did not distinguish the case and improperly rejected some valid cases that the assembler had previously accepted when they were pure pseudos that expanded as assembly instructions. llvm-svn: 183633	2013-06-09 23:23:46 +00:00
Logan Chien	1bd6e13b70	Refine the ARM EHABI test cases. Since we have ARM unwind directive parser and assembler, we can check the correctness in two stages: 1. From LLVM assembly (.ll) to ARM assembly (.s) 2. From ARM assembly (.s) to ELF object file (.o) We already have several ".s to .o" test cases. This CL adds some ".ll to .s" test cases and removes the redundant ".ll to .o" test cases. New test cases to check ".ll to .s" code generator: - ehabi.ll: Check the correctness of the generated unwind directives. - section-name.ll: Check the section name of functions. Removed test cases: - ehabi-mc-cantunwind.ll (Covered by ehabi-cantunwind.ll, and eh-directive-cantunwind.s) - ehabi-mc-compact-pr0.ll (Covered by ehabi.ll, eh-compact-pr0.s, eh-directive-save.s, and eh-directive-setfp.s) - ehabi-mc-compact-pr1.ll (Covered by ehabi.ll, eh-compact-pr1.s, eh-directive-save.s, and eh-directive-setfp.s) - ehabi-mc.ll (Covered by ehabi.ll, and eh-directive-integrated-test.s) - ehabi-mc-section-group.ll (Covered by section-name.ll, and eh-directive-section-comdat.s) - ehabi-mc-section.ll (Covered by section-name.ll, and eh-directive-section.s) - ehabi-mc-sh_link.ll (Covered by eh-directive-text-section.s, and eh-directive-section.s) llvm-svn: 183628	2013-06-09 12:36:57 +00:00
Logan Chien	325823a189	Fix ARM unwind opcode assembler in several cases. Changes to ARM unwind opcode assembler: * Fix multiple .save or .vsave directives. Besides, the order is preserved now. * For the directives which will generate multiple opcodes, such as ".save {r0-r11}", the order of the unwind opcode is fixed now, i.e. the registers with less encoding value are popped first. * Fix the $sp offset calculation. Now, we can use the .setfp, .pad, .save, and .vsave directives at any order. Changes to test cases: * Add test cases to check the order of multiple opcodes for the .save directive. * Fix the incorrect $sp offset in the test case. The stack pointer offset specified in the test case was incorrect. (Changed test cases: ehabi-mc-section.ll and ehabi-mc.ll) * The opcode to restore $sp are slightly reordered. The behavior are not changed, and the new output is same as the output of GNU as. (Changed test cases: eh-directive-pad.s and eh-directive-setfp.s) llvm-svn: 183627	2013-06-09 12:22:30 +00:00
Elena Demikhovsky	89703c06f2	Removed PackedDouble domain from scalar instructions. Added more formats for the scalar stuff. llvm-svn: 183626	2013-06-09 07:37:10 +00:00
Tim Northover	64280fbba1	Make DeadArgumentElimination more conservative on variadic functions Variadic functions are particularly fragile in the face of ABI changes, so this limits how much the pass changes them llvm-svn: 183625	2013-06-09 02:17:27 +00:00
Venkatraman Govindaraju	7dae9ce021	[Sparc] Delete FPMover Pass and remove Fp* Pseudo-instructions from Sparc backend. llvm-svn: 183613	2013-06-08 15:32:59 +00:00
Amaury de la Vieuville	f4ec0c8510	ARM: fix VMOVvnf32 decoding when ambiguous with VCVT Enforce Table A7-15 (op=1, cmode=0b111) -> UNDEF llvm-svn: 183612	2013-06-08 13:54:05 +00:00
Amaury de la Vieuville	68bcd021fd	ARM: enforce SRS decoding constraints llvm-svn: 183611	2013-06-08 13:43:59 +00:00
Amaury de la Vieuville	631df63e54	ARM: fix CPS decoding when ambiguous with QADD Handle the case when the disassembler table can't tell the difference between some encodings of QADD and CPS. Add some necessary safe guards in CPS decoding as well. llvm-svn: 183610	2013-06-08 13:38:52 +00:00
Amaury de la Vieuville	ea7bb57058	ARM: fix VCVT decoding UNPRED was reported instead of UNDEF llvm-svn: 183608	2013-06-08 13:29:11 +00:00
Shuxin Yang	140d592d84	Fix a potential bug in r183584. r183584 tries to derive some info from the code AFTER a call and apply these derived info to the code BEFORE the call, which is not always safe as the call in question may never return, and in this case, the derived info is invalid. Thank Duncan for pointing out this potential bug. rdar://14073661 llvm-svn: 183606	2013-06-08 04:56:05 +00:00
Quentin Colombet	249cb6756c	Reapply r183552. This time, use a standard type for the option to avoid template instantiation issue with non-standard type. Add a backend option to warn on a given stack size limit. Option: -mllvm -warn-stack-size=<limit> Output (if limit is exceeded): warning: Stack size limit exceeded (<actual size>) in <functionName>. The longer term plan is to hook that to a clang warning. PR:4072 <rdar://problem/13987214>. llvm-svn: 183595	2013-06-08 00:07:54 +00:00
Vincent Lejeune	4d143328df	R600: Anti dep better handled in tex clause llvm-svn: 183592	2013-06-07 23:30:26 +00:00
Jakob Stoklund Olesen	9f812b97ba	Add missing zextloadi1 to i64 patterns. PR16721. llvm-svn: 183587	2013-06-07 22:55:05 +00:00
Shuxin Yang	bd254f2601	Fix an assertion in MemCpyOpt pass. The MemCpyOpt pass is capable of optimizing: callee(&S); copy N bytes from S to D. into: callee(&D); subject to some legality constraints. Assertion is triggered when the compiler tries to evalute "sizeof(typeof(D))", while D is an opaque-typed, 'sret' formal argument of function being compiled. i.e. the signature of the func being compiled is something like this: T caller(...,%opaque* noalias nocapture sret %D, ...) The fix is that when come across such situation, instead of calling some utility functions to get the size of D's type (which will crash), we simply assume D has at least N bytes as implified by the copy-instruction. rdar://14073661 llvm-svn: 183584	2013-06-07 22:45:21 +00:00
Hal Finkel	fa5f6f7440	Disallow i64 div/rem in PPC32 counter loops On PPC32, [su]div,rem on i64 types are transformed into runtime library function calls. As a result, they are not allowed in counter-based loops (the counter-loops verification pass caught this error; this change fixes PR16169). llvm-svn: 183581	2013-06-07 22:16:19 +00:00
Quentin Colombet	bd5a201c85	Revert commits related to stack warning. llvm-svn: 183579	2013-06-07 22:14:50 +00:00
Quentin Colombet	9b08a0df1c	Explicit triple in warn stack size test cases to not depend on OS. llvm-svn: 183574	2013-06-07 21:09:42 +00:00
Tom Stellard	d74583777f	R600: Fix calculation of stack offset in AMDGPUFrameLowering We weren't computing structure size correctly and we were relying on the original alloca instruction to compute the offset, which isn't always reliable. Reviewed-by: Vincent Lejeune <vljn@ovi.com> llvm-svn: 183568	2013-06-07 20:52:05 +00:00
Tom Stellard	3498e4ff1d	R600: Fix the fetch limits for R600 generation GPUs Reviewed-by: Vincent Lejeune <vljn@ovi.com> https://bugs.freedesktop.org/show_bug.cgi?id=64257 llvm-svn: 183560	2013-06-07 20:28:55 +00:00
Quentin Colombet	6baf581b93	Add a backend option to warn on a given stack size limit. Option: -mllvm -warn-stack-size=<limit> Output (if limit is exceeded): warning: Stack size limit exceeded (<actual size>) in <functionName>. The longer term plan is to hook that to a clang warning. PR:4072 <rdar://problem/13987214> llvm-svn: 183552	2013-06-07 20:18:12 +00:00
JF Bastien	06ce03d141	ARM FastISel integer sext/zext improvements My recent ARM FastISel patch exposed this bug: http://llvm.org/bugs/show_bug.cgi?id=16178 The root cause is that it can't select integer sext/zext pre-ARMv6 and asserts out. The current integer sext/zext code doesn't handle other cases gracefully either, so this patch makes it handle all sext and zext from i1/i8/i16 to i8/i16/i32, with and without ARMv6, both in Thumb and ARM mode. This should fix the bug as well as make FastISel faster because it bails to SelectionDAG less often. See fastisel-ext.patch for this. fastisel-ext-tests.patch changes current tests to always use reg-imm AND for 8-bit zext instead of UXTB. This simplifies code since it is supported on ARMv4t and later, and at least on A15 both should perform exactly the same (both have exec 1 uop 1, type I). 2013-05-31-char-shift-crash.ll is a bitcode version of the above bug 16178 repro. fast-isel-ext.ll tests all sext/zext combinations that ARM FastISel should now handle. Note that my ARM FastISel enabling patch was reverted due to a separate failure when dealing with MCJIT, I'll fix this second failure and then turn FastISel on again for non-iOS ARM targets. I've tested "make check-all" on my x86 box, and "lnt test-suite" on A15 hardware. llvm-svn: 183551	2013-06-07 20:10:37 +00:00
Quentin Colombet	ba366011c8	Teach AsmPrinter how to print odd constants. Fix an assertion when the compiler encounters big constants whose bit width is not a multiple of 64-bits. Although clang would never generate something like this, the backend should be able to handle any legal IR. <rdar://problem/13363576> llvm-svn: 183544	2013-06-07 18:36:03 +00:00
Roman Divacky	158d8069ad	Fix a typo in asm string of BP* family of instructions. With this fix I am able to compile/assemble/link/run /bin/echo from FreeBSD. llvm-svn: 183537	2013-06-07 17:46:57 +00:00
Rafael Espindola	aad6c24422	Support OpenBSD's native frame protection conventions. OpenBSD's stack smashing protection differs slightly from other platforms: 1. The smash handler function is "__stack_smash_handler(const char *funcname)" instead of "__stack_chk_fail(void)". 2. There's a hidden "long __guard_local" object that gets linked into each executable and DSO. Patch by Matthew Dempsky. llvm-svn: 183533	2013-06-07 16:35:57 +00:00
Michael Gottesman	9e7261c874	[objc-arc] Ensure that the cfg path count does not overflow when we multiply TopDownPathCount/BottomUpPathCount. rdar://12480535 llvm-svn: 183489	2013-06-07 06:16:49 +00:00
Venkatraman Govindaraju	dc82ac0dcc	[Sparc]: Use cmp instruction instead of subcc to compare integers. llvm-svn: 183463	2013-06-07 00:03:36 +00:00
Kevin Enderby	fb5bddfd0a	Move the test for the data in code into the ARM directory as it is an ARM binary that is used for the test. Caught by Jim Grosbach! rdar://11791371 llvm-svn: 183442	2013-06-06 20:28:28 +00:00
Rafael Espindola	932470bcd9	Add a testcase from pr16244. llvm-svn: 183433	2013-06-06 19:15:23 +00:00
Kevin Enderby	273ae01b03	Teach llvm-objdump with the -macho parser how to use the data in code table from the LC_DATA_IN_CODE load command. And when disassembling print the data in code formatted for the kind of data it and not disassemble those bytes. I added the format specific functionality to the derived class MachOObjectFile since these tables only appears in Mach-O object files. This is my first attempt to modify the libObject stuff so if folks have better suggestions how to fit this in or suggestions on the implementation please let me know. rdar://11791371 llvm-svn: 183424	2013-06-06 17:20:50 +00:00
Rafael Espindola	e2e741ecdd	Print symbol names in relocations when dumping COFF as YAML. llvm-svn: 183403	2013-06-06 13:06:17 +00:00
Vincent Lejeune	dec1875207	R600: Add a pass that merge Vector Register Previously commited @183279 but tests were failing, reverted @183286 It was broken because @183336 was missing, now it's there. llvm-svn: 183343	2013-06-05 21:38:04 +00:00
Rafael Espindola	7c346c2cc9	Don't hide the first ELF symbol. The first symbol on ELF is dummy, but it has a defined content and readelf normally displays it. With this change llvm-readobj also displays it and we can check that llvm-mc output is correct according to the standard. llvm-svn: 183337	2013-06-05 20:33:54 +00:00
Vincent Lejeune	4b5b849753	R600: Schedule copy from phys register at beginning of block It allows regalloc pass to remove them by trivially assigning associated reg llvm-svn: 183336	2013-06-05 20:27:35 +00:00
Akira Hatanaka	da4496c860	[mips] brcond + setgt/setugt instruction selection patterns. llvm-svn: 183334	2013-06-05 19:49:55 +00:00
Michael Liao	00b20cc924	[PATCH] Fix VGATHER* operand constraints Add earlyclobber constaints to prevent input register being allocated as the output register because, according to Intel spec [1], "If any pair of the index, mask, or destination registers are the same, this instruction results a UD fault." --- [1] http://software.intel.com/sites/default/files/319433-014.pdf llvm-svn: 183327	2013-06-05 18:12:26 +00:00
Mihai Popa	0e9892fe3a	This is a simple patch that changes RRX and RRXS to accept all registers as operands. According to the ARM reference manual, RRX(S) have defined encodings for lr, pc and sp. llvm-svn: 183307	2013-06-05 13:23:51 +00:00
David Blaikie	6f1a8067fb	PR15662: Optimized debug info produces out of order function parameters When a function is inlined we lazily construct the variables representing the function's parameters. After that, we add any remaining unused parameters. If the function doesn't use all the parameters, or uses them out of order, then the DWARF would produce them in that order, producing a parameter order that doesn't match the source. This fix causes us to always keep the arg variables at the start of the variable list & in the original order from the source. llvm-svn: 183297	2013-06-05 05:39:59 +00:00
Tom Stellard	aad5376fb6	R600: Make sure to schedule AR register uses and defs in the same clause Reviewed-by: vljn at ovi.com llvm-svn: 183294	2013-06-05 03:43:06 +00:00
Rafael Espindola	0fd21ca699	Don't print default values for NumberOfAuxSymbols and AuxiliaryData. llvm-svn: 183293	2013-06-05 03:20:13 +00:00
Rafael Espindola	beef23fe21	Revert "R600: Add a pass that merge Vector Register" This reverts commit r183279. CodeGen/R600/texture-input-merge.ll was failing. llvm-svn: 183286	2013-06-05 01:48:30 +00:00
Rafael Espindola	806f006490	Handle relocations that don't point to symbols. In ELF (as in MachO), not all relocations point to symbols. Represent this properly by using a symbol_iterator instead of a SymbolRef. Update llvm-readobj ELF's dumper to handle relocatios without symbols. llvm-svn: 183284	2013-06-05 01:33:53 +00:00
Vincent Lejeune	a45aafabfe	R600: Add a pass that merge Vector Register llvm-svn: 183279	2013-06-04 23:17:26 +00:00
Vincent Lejeune	c689679173	R600: Const/Neg/Abs can be folded to dot4 llvm-svn: 183278	2013-06-04 23:17:15 +00:00
Evan Cheng	4ec309700b	Cortex-R5 can issue Thumb2 integer division instructions. llvm-svn: 183275	2013-06-04 22:52:09 +00:00
David Majnemer	29130c5e8d	IndVarSimplify: check if loop invariant expansion can trap IndVarSimplify is willing to move divide instructions outside of their loop bodies if they are invariant of the loop. However, it may not be safe to expand them if we do not know if they can trap. Instead, check to see if it is not safe to expand the instruction and skip the expansion. This fixes PR16041. Testcase by Rafael Ávila de Espíndola. llvm-svn: 183239	2013-06-04 17:51:58 +00:00
David Majnemer	452f1f97bd	ARM: Fix crash in ARM backend inside of ARMConstantIslandPass The ARM backend did not expect LDRBi12 to hold a constant pool operand. Allow for LLVM to deal with the instruction similar to how it deals with LDRi12. This fixes PR16215. llvm-svn: 183238	2013-06-04 17:46:15 +00:00
Vincent Lejeune	276ceb8d5f	R600: Swizzle texture/export instructions llvm-svn: 183229	2013-06-04 15:04:53 +00:00
Vincent Lejeune	db185c08cd	R600: Add a test for r183108 llvm-svn: 183228	2013-06-04 15:03:35 +00:00
Rafael Espindola	a5e536ab0e	Second part of pr16069 The problem this time seems to be a thinko. We were assuming that in the CFG A \| \ \| B \| / C speculating the basic block B would cause only the phi value for the B->C edge to be speculated. That is not true, the phi's are semantically in the edges, so if the A->B->C path is taken, any code needed for A->C is not executed and we have to consider it too when deciding to speculate B. llvm-svn: 183226	2013-06-04 14:11:59 +00:00
Alexey Samsonov	5239d58c8e	[llvm-symbolizer] Avoid calling slow getSymbolSize for Mach-O files. Assume that symbols with zero size are in fact large enough. llvm-svn: 183213	2013-06-04 07:57:38 +00:00
David Majnemer	c82f27af2a	SimplifyCFG: Do not transform PHI to select if doing so would be unsafe PR16069 is an interesting case where an incoming value to a PHI is a trap value while also being a 'ConstantExpr'. We do not consider this case when performing the 'HoistThenElseCodeToIf' optimization. Instead, make our modifications more conservative if we detect that we cannot transform the PHI to a select. llvm-svn: 183152	2013-06-03 20:43:12 +00:00
Rafael Espindola	f102438f3a	Enable mcjit tests on ppc64 when building with cmake. llvm-svn: 183143	2013-06-03 19:17:21 +00:00
Tom Stellard	94593ee8c3	R600/SI: Add support for work item and work group intrinsics llvm-svn: 183138	2013-06-03 17:40:18 +00:00
Tom Stellard	ed882c2f1b	R600/SI: Add a calling convention for compute shaders llvm-svn: 183137	2013-06-03 17:40:11 +00:00
Tom Stellard	046039e81b	R600/SI: Custom lower i64 sign_extend llvm-svn: 183136	2013-06-03 17:40:03 +00:00
Tom Stellard	07a10a3d3f	R600/SI: Add support for global loads llvm-svn: 183131	2013-06-03 17:39:43 +00:00
Vincent Lejeune	f83df1f1cb	R600: use capital letter for PV channel llvm-svn: 183107	2013-06-03 15:44:35 +00:00
Alexey Samsonov	213527d9c9	Correct handling invalid filename in llvm-symbolizer llvm-svn: 183102	2013-06-03 14:12:39 +00:00
Venkatraman Govindaraju	f80d72f149	Sparc: Add support for indirect branch and blockaddress in Sparc backend. llvm-svn: 183094	2013-06-03 05:58:33 +00:00
Rui Ueyama	f4d0a8c13f	[Object/COFF] Fix Windows .lib name handling. llvm-svn: 183091	2013-06-03 00:27:03 +00:00
Venkatraman Govindaraju	774fe2e29a	Sparc: When storing 0, use %g0 directly in the store instruction instead of using two instructions (sethi and store). llvm-svn: 183090	2013-06-03 00:21:54 +00:00
Venkatraman Govindaraju	0bbe1b210e	Sparc: Combine add/or/sethi instruction with restore if possible. llvm-svn: 183088	2013-06-02 21:48:17 +00:00
Venkatraman Govindaraju	3e8c7d98be	Sparc: Perform leaf procedure optimization by default llvm-svn: 183083	2013-06-02 02:24:27 +00:00
Nick Lewycky	3f715e260a	When determining the new index for an insertelement, we may not assume that an index greater than the size of the vector is invalid. The shuffle may be shrinking the size of the vector. Fixes a crash! Also drop the maximum recursion depth of the safety check for this optimization to five. llvm-svn: 183080	2013-06-01 20:51:31 +00:00
Venkatraman Govindaraju	28e2cd0e7e	Sparc: Mark functions calling llvm.vastart and llvm.returnaddress intrinsics as non-leaf functions. llvm-svn: 183079	2013-06-01 20:42:48 +00:00
Tim Northover	c35854077b	Disable new legacy JIT test on ARM. llvm-svn: 183071	2013-06-01 10:24:11 +00:00
Tim Northover	339bf154cc	Revert r183069: "TMP: LEA64_32r fixing" Very sorry, it was committed from the wrong branch by mistake. llvm-svn: 183070	2013-06-01 10:23:46 +00:00
Tim Northover	57954f04b3	TMP: LEA64_32r fixing llvm-svn: 183069	2013-06-01 10:21:54 +00:00
Tim Northover	3a1fd4c0ac	X86: change MOV64ri64i32 into MOV32ri64 The MOV64ri64i32 instruction required hacky MCInst lowering because it was allocated as setting a GR64, but the eventual instruction ("movl") only set a GR32. This converts it into a so-called "MOV32ri64" which still accepts a (appropriate) 64-bit immediate but defines a GR32. This is then converted to the full GR64 by a SUBREG_TO_REG operation, thus keeping everyone happy. This fixes a typo in the opcode field of the original patch, which should make the legact JIT work again (& adds test for that problem). llvm-svn: 183068	2013-06-01 09:55:14 +00:00
Venkatraman Govindaraju	3521dcdcc4	[Sparc] Generate correct code for leaf functions with stack objects llvm-svn: 183067	2013-06-01 04:51:18 +00:00
Andrew Trick	ee9143acf5	Prevent loop-unroll from making assumptions about undefined behavior. Fixes rdar:14036816, PR16130. There is an opportunity to compute precise trip counts for 'or' expressions and multi-exit loops. rdar:14038809: Optimize trip count computation for multi-exit loops. To do this we need to record the fact that ExitLimit assumes NSW. When it does not we can safely assume that the loop trip count is the minimum ExitLimt across all subexpressions and loop exits. llvm-svn: 183060	2013-05-31 23:34:46 +00:00
Eric Christopher	e1e57e5ebd	Temporarily Revert "X86: change MOV64ri64i32 into MOV32ri64" as it seems to have caused PR16192 and other JIT related failures. llvm-svn: 183059	2013-05-31 23:30:45 +00:00
Arnold Schwaighofer	70a9be5297	LoopVectorize: PHIs with only outside users should prevent vectorization We check that instructions in the loop don't have outside users (except if they are reduction values). Unfortunately, we skipped this check for if-convertable PHIs. Fixes PR16184. llvm-svn: 183035	2013-05-31 19:53:50 +00:00
Quentin Colombet	8aa7abe2ae	Modify how the formulae are rated in Loop Strength Reduce. Namely, check if the target allows to fold more that one register in the addressing mode and if yes, adjust the cost accordingly. Prior to this commit, reg1 + scale * reg2 accesses were artificially preferred to reg1 + reg2 accesses. Indeed, the cost model wrongly assumed that reg1 + reg2 needs a temporary register for the computation, whereas it was correctly estimated for reg1 + scale * reg2. <rdar://problem/13973908> llvm-svn: 183021	2013-05-31 17:20:29 +00:00
Andrew Trick	3c944ba2f0	Unit test for SCEV fix r182989, PR16130. llvm-svn: 183017	2013-05-31 16:42:41 +00:00
Tim Northover	4d14144024	ARM: permit upper-case BE/LE on setend instruction Patch by Amaury de la Vieuville. llvm-svn: 183012	2013-05-31 15:58:45 +00:00
Tim Northover	4173e29a98	ARM: add fstmx and fldmx instructions for assembly These instructions are deprecated oddities, but we still need to be able to disassemble (and reassemble) them if and when they're encountered. Patch by Amaury de la Vieuville. llvm-svn: 183011	2013-05-31 15:55:51 +00:00
Rafael Espindola	65281bf36e	Simplify multiplications by vectors whose elements are powers of 2. Patch by Andrea Di Biagio. llvm-svn: 183005	2013-05-31 14:27:15 +00:00
Tim Northover	1bb672da81	ARM: fix VEXT encoding corner case The disassembly of VEXT instructions was too lax in the bits checked. This fixes the case where the instruction affects Q-registers but a misaligned lane was specified (should be UNDEFINED). Patch by Amaury de la Vieuville llvm-svn: 183003	2013-05-31 13:47:25 +00:00
Richard Sandiford	30efd87f6e	[SystemZ] Don't use LOAD and STORE REVERSED for volatile accesses Unlike most -- hopefully "all other", but I'm still checking -- memory instructions we support, LOAD REVERSED and STORE REVERSED may access the memory location several times. This means that they are not suitable for volatile loads and stores. This patch is a prerequisite for better atomic load and store support. The same principle applies there: almost all memory instructions we support are inherently atomic ("block concurrent"), but LOAD REVERSED and STORE REVERSED are exceptions. Other instructions continue to allow volatile operands. I will add positive "allows volatile" tests at the same time as the "allows atomic load or store" tests. llvm-svn: 183002	2013-05-31 13:25:22 +00:00
Justin Holewinski	dbb3b2f4b6	[NVPTX] Re-enable support for virtual registers in the final output Now that 3.3 is branched, we are re-enabling virtual registers to help iron out bugs before the next release. Some of the post-RA passes do not play well with virtual registers, so we disable them for now. The needed functionality of the PrologEpilogInserter pass is copied to a new backend-specific NVPTXPrologEpilog pass. The test for this commit is not breaking the existing tests. llvm-svn: 182998	2013-05-31 12:14:49 +00:00
Evgeniy Stepanov	888385e40f	[msan] Handle mixed track-origins and keep-going settings (llvm part). Before this change, each module defined a weak_odr global __msan_track_origins with a value of 1 if origin tracking is enabled, 0 if disabled. If there are modules with different values, any of them may win. If 0 wins, and there is at least one module with 1, the program will most likely crash. With this change, __msan_track_origins is only emitted if origin tracking is on. Then runtime library detects if there is at least one module with origin tracking, and enables runtime support for it. llvm-svn: 182997	2013-05-31 12:04:29 +00:00
Tim Northover	d4736d67f4	X86: change MOV64ri64i32 into MOV32ri64 The MOV64ri64i32 instruction required hacky MCInst lowering because it was allocated as setting a GR64, but the eventual instruction ("movl") only set a GR32. This converts it into a so-called "MOV32ri64" which still accepts a (appropriate) 64-bit immediate but defines a GR32. This is then converted to the full GR64 by a SUBREG_TO_REG operation, thus keeping everyone happy. llvm-svn: 182991	2013-05-31 09:57:13 +00:00
Akira Hatanaka	2bf97336af	[mips] Big-endian code generation for atomic instructions. Patch by Jyun-Yan You. llvm-svn: 182984	2013-05-31 03:25:44 +00:00
Nick Lewycky	a2b7720618	Reapply with r182909 with a fix to the calculation of the new indices for insertelement instructions. llvm-svn: 182976	2013-05-31 00:59:42 +00:00
Rafael Espindola	99bd2ae479	Revert r182937 and r182877. r182877 broke MCJIT tests on ARM and r182937 was working around another failure by r182877. This should make the ARM bots green. llvm-svn: 182960	2013-05-30 20:37:52 +00:00
Rafael Espindola	cf6bde9e2b	Don't use fast isel on this test. This fixes the test on ARM. Looks like it was broken by r182877. Not sure if this is a bug on fast isel on ARM, but this should help fix the ARM bots. llvm-svn: 182937	2013-05-30 16:29:28 +00:00
Benjamin Kramer	54cd84861e	Force a triple so we don't get bitten by windows' different regalloc. llvm-svn: 182935	2013-05-30 15:39:35 +00:00
Benjamin Kramer	dc93c8d50c	Force fragile test to the atom scheduler model. The pattern the test originally checked for doesn't occur on other -mcpu settings. On atom it's still there though slightly differently scheduled. llvm-svn: 182933	2013-05-30 15:22:28 +00:00
Tim Northover	c0b42a257d	X86: allow registers 8-15 in test This test was failing on some hosts when an unexpected register was used for a variable. This just extends the regexp to allow the new x86-64 registers. llvm-svn: 182929	2013-05-30 13:56:32 +00:00
Tim Northover	64ec0ff433	X86: use sub-register sequences for MOV*r0 operations Instead of having a bunch of separate MOV8r0, MOV16r0, ... pseudo-instructions, it's better to use a single MOV32r0 (which will expand to "xorl %reg, %reg") and obtain other sizes with EXTRACT_SUBREG and SUBREG_TO_REG. The encoding is smaller and partial register updates can sometimes be avoided. Until recently, this sequence was a barrier to rematerialization though. That should now be fixed so it's an appropriate time to make the change. llvm-svn: 182928	2013-05-30 13:19:42 +00:00
Justin Holewinski	994d66a345	[NVPTX] Fix case where a sext load of an i1 type may produce an ld.u1 instead of an ld.u8. llvm-svn: 182924	2013-05-30 12:22:39 +00:00
Richard Sandiford	46af5a2cdc	[SystemZ] Enable unaligned accesses The code to distinguish between unaligned and aligned addresses was already there, so this is mostly just a switch-on-and-test process. llvm-svn: 182920	2013-05-30 09:45:42 +00:00
Evgeniy Stepanov	2c14269883	Revert r182909. PR/16177 llvm-svn: 182919	2013-05-30 09:40:17 +00:00
Nick Lewycky	d7f27094c0	Swizzle vector inputs if it helps us eliminate shuffles. llvm-svn: 182909	2013-05-30 04:33:38 +00:00
Rafael Espindola	4f60a38f18	Change how we iterate over relocations on ELF. For COFF and MachO, sections semantically have relocations that apply to them. That is not the case on ELF. In relocatable objects (.o), a section with relocations in ELF has offsets to another section where the relocations should be applied. In dynamic objects and executables, relocations don't have an offset, they have a virtual address. The section sh_info may or may not point to another section, but that is not actually used for resolving the relocations. This patch exposes that in the ObjectFile API. It has the following advantages: * Most (all?) clients can handle this more efficiently. They will normally walk all relocations, so doing an effort to iterate in a particular order doesn't save time. * llvm-readobj now prints relocations in the same way the native readelf does. * probably most important, relocations that don't point to any section are now visible. This is the case of relocations in the rela.dyn section. See the updated relocation-executable.test for example. llvm-svn: 182908	2013-05-30 03:05:14 +00:00
Bill Wendling	2aa007c59c	This testcase tests command line attributes which we don't yet support. In fact, we're probably going to support these flags in completely different ways. So this test is no longer valid. llvm-svn: 182899	2013-05-30 00:32:04 +00:00
Andrew Trick	ad6d08ac6f	Order CALLSEQ_START and CALLSEQ_END nodes. Fixes PR16146: gdb.base__call-ar-st.exp fails after pre-RA-sched=source fixes. Patch by Xiaoyi Guo! This also fixes an unsupported dbg.value test case. Codegen was previously incorrect but the test was passing by luck. llvm-svn: 182885	2013-05-29 22:03:55 +00:00
JF Bastien	f60e0e44ca	Enable FastISel on ARM for Linux and NaCl FastISel was only enabled for iOS ARM and Thumb2, this patch enables it for ARM (not Thumb2) on Linux and NaCl. Thumb2 support needs a bit more work, mainly around register class restrictions. The patch punts to SelectionDAG when doing TLS relocation on non-Darwin targets. I will fix this and other FastISel-to-SelectionDAG failures in a separate patch. The patch also forces FastISel to retain frame pointers: iOS always keeps them for backtracking (so emitted code won't change because of this), but Linux was getting much worse code that was incorrect when using big frames (such as test-suite's lencod). I'll also fix this in a later patch, it will probably require a peephole so that FastISel doesn't rematerialize frame pointers back-to-back. The test changes are straightforward, similar to: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130513/174279.html They also add a vararg test that got dropped in that change. I ran all of test-suite on A15 hardware with --optimize-option=-O0 and all the tests pass. llvm-svn: 182877	2013-05-29 20:38:10 +00:00
Tim Northover	b65f6b0820	Teach ReMaterialization to be more cunning about subregisters This allows rematerialization during register coalescing to handle more cases involving operations like SUBREG_TO_REG which might need to be rematerialized using sub-register indices. For example, code like: v1(GPR64):sub_32 = MOVZ something v2(GPR64) = COPY v1(GPR64) should be convertable to: v2(GPR64):sub_32 = MOVZ something but previously we just gave up in places like this llvm-svn: 182872	2013-05-29 19:32:06 +00:00
Manman Ren	4213c39e3c	LTO+Debug Info: revert r182791. Since the testing case uses ref_addr, which requires version 3+ to work, we will solve the dwarf version issue first. This patch also causes failures in one of the bots. I will update the patch accordingly in my next attempt. rdar://13926659 llvm-svn: 182867	2013-05-29 17:16:59 +00:00
Richard Sandiford	ba97c34bb6	[SystemZ] Two tests missing from previous commit llvm-svn: 182847	2013-05-29 11:59:26 +00:00
Richard Sandiford	e1d9f00f09	[SystemZ] Immediate compare-and-branch support This patch adds support for the CIJ and CGIJ instructions. llvm-svn: 182846	2013-05-29 11:58:52 +00:00
Benjamin Kramer	490bc1a27f	Move test that depends on the X86 backend into the right subdirectory. llvm-svn: 182834	2013-05-29 08:40:49 +00:00
Venkatraman Govindaraju	ca0fe2f57e	[Sparc] Add support for leaf functions in sparc backend. llvm-svn: 182822	2013-05-29 04:46:31 +00:00
Jack Carter	0259300325	Mips assembler: Improve set register alias handling This patch solves the problem of numeric register values not being accepted: ../set_alias.s:1:11: error: expected valid expression after comma .set r4,$4 ^ The parsing of .set directive is changed and handling of symbols in code as well to enable this feature. The test example is added. Patch by Vladimir Medic llvm-svn: 182807	2013-05-28 22:21:05 +00:00
Paul Redmond	5fdf836ba4	Add support for llvm.vectorizer metadata - llvm.loop.parallel metadata has been renamed to llvm.loop to be more generic by making the root of additional loop metadata. - Loop::isAnnotatedParallel now looks for llvm.loop and associated llvm.mem.parallel_loop_access - document llvm.loop and update llvm.mem.parallel_loop_access - add support for llvm.vectorizer.width and llvm.vectorizer.unroll - document llvm.vectorizer.* metadata - add utility class LoopVectorizerHints for getting/setting loop metadata - use llvm.vectorizer.width=1 to indicate already vectorized instead of already_vectorized - update existing tests that used llvm.loop.parallel and llvm.vectorizer.already_vectorized Reviewed by: Nadav Rotem llvm-svn: 182802	2013-05-28 20:00:34 +00:00
Tim Northover	3b684d8359	ARM: use pristine object file while processing relocations Previously we would read-modify-write the target bits when processing relocations for the MCJIT. This had the problem that when relocations were processed multiple times for the same object file (as they can be), the result is not idempotent and the values became corrupted. The solution to this is to take any bits used in the destination from the pristine object file as LLVM emitted it. This should fix PR16013 and remote MCJIT on ARM ELF targets. llvm-svn: 182800	2013-05-28 19:48:19 +00:00
Manman Ren	b5b5453e61	LTO+Debug Info: correctly emit inlined_subroutine when the inlined callee is from a different CU. We used to print out an error message and fail to generate inlined_subroutine. If we use ref_addr in the generated DWARF, the DWARF version should be 3 or above. rdar://13926659 llvm-svn: 182791	2013-05-28 19:01:58 +00:00
James Molloy	f6f121e277	Extend RemapInstruction and friends to take an optional new parameter, a ValueMaterializer. Extend LinkModules to pass a ValueMaterializer to RemapInstruction and friends to lazily create Functions for lazily linked globals. This is a big win when linking small modules with large (mostly unused) library modules. llvm-svn: 182776	2013-05-28 15:17:05 +00:00
Evgeniy Stepanov	fca012334b	[msan] Fix argument shadow alignment. llvm-svn: 182771	2013-05-28 13:07:43 +00:00
Richard Sandiford	0fb90ab0cb	[SystemZ] Register compare-and-branch support This patch adds support for the CRJ and CGRJ instructions. Support for the immediate forms will be a separate patch. The architecture has a large number of comparison instructions. I think it's generally better to concentrate on using the "best" comparison instruction first and foremost, then only use something like CRJ if CR really was the natual choice of comparison instruction. The patch therefore opportunistically converts separate CR and BRC instructions into a single CRJ while emitting instructions in ISelLowering. llvm-svn: 182764	2013-05-28 10:41:11 +00:00
Michael Kuperstein	f3e663af39	Make BasicAliasAnalysis recognize the fact a noalias argument cannot alias another argument, even if the other argument is not itself marked noalias. llvm-svn: 182755	2013-05-28 08:17:48 +00:00
Preston Gurd	048f99de11	Convert sqrt functions into sqrt instructions when -ffast-math is in effect. When -ffast-math is in effect (on Linux, at least), clang defines __FINITE_MATH_ONLY__ > 0 when including <math.h>. This causes the preprocessor to include <bits/math-finite.h>, which renames the sqrt functions. For instance, "sqrt" is renamed as "__sqrt_finite". This patch adds the 3 new names in such a way that they will be treated as equivalent to their respective original names. llvm-svn: 182739	2013-05-27 15:44:35 +00:00
Rafael Espindola	cca5f562db	Add a cpu to try to bring back the atom bots. llvm-svn: 182734	2013-05-27 13:22:52 +00:00
Hal Finkel	7d8a691b5d	Prefer to duplicate PPC Altivec loads when expanding unaligned loads When expanding unaligned Altivec loads, we use the decremented offset trick to prevent page faults. Unfortunately, if we have a sequence of consecutive unaligned loads, this leads to suboptimal code generation because the 'extra' load from the first unaligned load can be combined with the base load from the second (but only if the decremented offset trick is not used for the first). Search up and down the chain, through loads and token factors, looking for consecutive loads, and if one is found, don't use the offset reduction trick. These duplicate loads are later combined to yield the desired sequence (in the future, we might want a more-powerful chain search, but that will require some changes to allow the combiner routines to access the AA object). This should complete the initial implementation of the optimized unaligned Altivec load expansion. There is some refactoring that should be done, but that will happen when the unaligned store expansion is added. llvm-svn: 182719	2013-05-26 18:08:30 +00:00
Andrew Trick	c66d26adf0	Fix PR16143: Insert DEBUG_VALUE before terminator. llvm-svn: 182717	2013-05-26 08:58:50 +00:00
Cameron Zwarich	80cbcd2d11	Add support for DWARF line number table entries for values in the instruction stream. llvm-svn: 182712	2013-05-25 21:56:53 +00:00
Hal Finkel	bc2ee4c4e6	PPC: Combine duplicate (offset) lvsl Altivec intrinsics The lvsl permutation control instruction is a function only of the alignment of the pointer operand (relative to the 16-byte natural alignment of Altivec vectors). As a result, multiple lvsl intrinsics where the operands differ by a multiple of 16 can be combined. llvm-svn: 182708	2013-05-25 04:05:05 +00:00
Andrew Trick	8972aba193	Track IR ordering of SelectionDAG nodes 4/4. Unit test cases for -pre-RA-sched=source. llvm-svn: 182706	2013-05-25 03:26:51 +00:00
Andrew Trick	e2431c64bc	Track IR ordering of SelectionDAG nodes 3/4. Remove the old IR ordering mechanism and switch to new one. Fix unit test failures. llvm-svn: 182704	2013-05-25 03:08:10 +00:00
Hal Finkel	cf2e908014	PPC: Initial support for permutation-based unaligned Altivec loads Altivec only directly supports aligned loads, but the loads have a strange property: If given an unaligned address, they truncate the address to the next lower aligned address, and load from there. This property, along with an extra load and some special-purpose permutation-control instructions that generate the appropriate permutations from the original unaligned address, allow efficient lowering of aligned loads. This code uses the trick explained in the Apple Velocity Engine optimization overview document to prevent the needed extra load from possibly causing a page fault if the original address happens to be aligned. As noted in the FIXMEs, there are several additional optimizations that can be performed to reduce the cost of these loads even more. These will be implemented in future commits. llvm-svn: 182691	2013-05-24 23:00:14 +00:00
Michael Gottesman	e67f40c514	[objc-arc] KnownSafe does not imply that it is safe to perform code motion across CFG edges since even if it is safe to remove RR pairs, we may still be able to move a retain/release into a loop. rdar://13949644 llvm-svn: 182670	2013-05-24 20:44:05 +00:00
Michael Gottesman	5a91bbf33a	[objc-arc] Make sure that multiple owners is propogated correctly through the pass via the usage of a global data structure. rdar://13750319 llvm-svn: 182669	2013-05-24 20:44:02 +00:00
Benjamin Kramer	6ac1e62377	LoopVectorize: LoopSimplify can't canonicalize loops with an indirectbr in it, don't assert on those cases. Fixes PR16139. llvm-svn: 182656	2013-05-24 18:05:35 +00:00
Richard Sandiford	dc5ed71353	[SystemZ] Improve AsmParser handling of invalid instructions Previously, an invalid instruction like: foo %r1, %r0 would generate the rather odd error message: ....: error: unknown token in expression foo %r1, %r0 ^ We now get the more informative: ....: error: invalid instruction foo %r1, %r0 ^ The same would happen if an address were used where a register was expected. We now get "invalid operand for instruction" instead. llvm-svn: 182644	2013-05-24 14:26:46 +00:00
Richard Sandiford	675f86996a	[SystemZ] Improve AsmParser register parsing The idea is to make sure that: (1) "register expected" is restricted to cases where ParseRegister() is called and the token obviously isn't a register. (2) "invalid register" is restricted to cases where a register-like "%..." sequence is found, but the "..." makes no sense. (3) the generic "invalid operand for instruction" is used in cases where the wrong register type is used (GPR instead of FPR, etc.). (4) the new "invalid register pair" is used if the register has the right type, but is not a valid register pair. Testing of (1)-(3) is now restricted to regs-bad.s. It uses a representative instruction for each register class to make sure that only registers from that class are accepted. (4) is tested by both regs-bad.s (which checks all invalid register pairs) and insn-bad.s (which tests one invalid pair for each instruction that requires a pair). While there, I changed "Number" to "Num" for consistency with the operand class. llvm-svn: 182643	2013-05-24 14:14:38 +00:00
Joey Gouly	83699284be	scalarizePHI needs to insert the next ExtractElement in the same block as the BinaryOperator, not in the block where the IRBuilder is currently inserting into. Fixes a bug where scalarizePHI would create instructions that would not dominate all uses. llvm-svn: 182639	2013-05-24 12:29:54 +00:00
Diego Novillo	c63995394d	Add a new function attribute 'cold' to functions. Other than recognizing the attribute, the patch does little else. It changes the branch probability analyzer so that edges into blocks postdominated by a cold function are given low weight. Added analysis and code generation tests. Added documentation for the new attribute. llvm-svn: 182638	2013-05-24 12:26:52 +00:00
Ahmed Bougacha	ad1084de84	Add MCSymbolizer for symbolic/annotated disassembly. This is a basic first step towards symbolization of disassembled instructions. This used to be done using externally provided (C API) callbacks. This patch introduces: - the MCSymbolizer class, that mimics the same functions that were used in the X86 and ARM disassemblers to symbolize immediate operands and to annotate loads based off PC (for things like c string literals). - the MCExternalSymbolizer class, which implements the old C API. - the MCRelocationInfo class, which provides a way for targets to translate relocations (either object::RelocationRef, or disassembler C API VariantKinds) to MCExprs. - the MCObjectSymbolizer class, which does symbolization using what it finds in an object::ObjectFile. This makes simple symbolization (with no fancy relocation stuff) work for all object formats! - x86-64 Mach-O and ELF MCRelocationInfos. - A basic ARM Mach-O MCRelocationInfo, that provides just enough to support the C API VariantKinds. Most of what works in otool (the only user of the old symbolization API that I know of) for x86-64 symbolic disassembly (-tvV) works, namely: - symbol references: call _foo; jmp 15 <_foo+50> - relocations: call _foo-_bar; call _foo-4 - __cf?string: leaq 193(%rip), %rax ## literal pool for "hello" Stub support is the main missing part (because libObject doesn't know, among other things, about mach-o indirect symbols). As for the MCSymbolizer API, instead of relying on the disassemblers to call the tryAdding* methods, maybe this could be done automagically using InstrInfo? For instance, even though PC-relative LEAs are used to get the address of string literals in a typical Mach-O file, a MOV would be used in an ELF file. And right now, the explicit symbolization only recognizes PC-relative LEAs. InstrInfo should have already have most of what is needed to know what to symbolize, so this can definitely be improved. I'd also like to remove object::RelocationRef::getValueString (it seems only used by relocation printing in objdump), as simply printing the created MCExpr is definitely enough (and cleaner than string concats). llvm-svn: 182625	2013-05-24 00:39:57 +00:00
Tim Northover	bc93308489	ARM: implement @llvm.readcyclecounter intrinsic This implements the @llvm.readcyclecounter intrinsic as the specific MRC instruction specified in the ARM manuals for CPUs with the Power Management extensions. Older CPUs had slightly different methods which may also have to be implemented eventually, but this should cover all v7 cases. rdar://problem/13939186 llvm-svn: 182603	2013-05-23 19:11:20 +00:00
Tom Stellard	1b086cbcb8	R600: Fix R600ControlFlowFinalizer not considering VTX_READ 128 bit dst reg Patch by: Vincent Lejeune https://bugs.freedesktop.org/show_bug.cgi?id=64877 NOTE: This is a candidate for the 3.3 branch. llvm-svn: 182600	2013-05-23 18:26:42 +00:00
Jakob Stoklund Olesen	43711c51ec	Fix PR16110: Handle DBG_VALUE in ConnectedVNInfoEqClasses::Distribute(). Now that the LiveDebugVariables pass is running after register coalescing, the ConnectedVNInfoEqClasses class needs to deal with DBG_VALUE instructions. This only comes up when rematerialization during coalescing causes the remaining live range of a virtual register to separate into two connected components. llvm-svn: 182592	2013-05-23 17:02:23 +00:00
Nick Lewycky	7b431030ac	Add missing test from r175092. llvm-svn: 182564	2013-05-23 07:46:13 +00:00
David Blaikie	5174c84add	Solidify the assumption that a DW_TAG_subprogram's type is a DW_TAG_subroutine_type There were bits & pieces of code lying around that may've given the impression that debug info metadata supported the possibility that a subprogram's type could be specified by a non-subroutine type describing the return type of a void function. This support was incomplete & unnecessary. Asserts & API have been changed to make the desired usage more clear. llvm-svn: 182532	2013-05-22 23:22:18 +00:00
Nadav Rotem	9e00eb38a2	SLPVectorizer: Change the order in which new instructions are added to the function. We are not working on a DAG and I ran into a number of problems when I enabled the vectorizations of 'diamond-trees' (trees that share leafs). * Imroved the numbering API. * Changed the placement of new instructions to the last root. * Fixed a bug with external tree users with non-zero lane. * Fixed a bug in the placement of in-tree users. llvm-svn: 182508	2013-05-22 19:47:32 +00:00
Nadav Rotem	7b66c47051	X86: Fix a bug in EltsFromConsecutiveLoads. We can't generate new loads without chains. llvm-svn: 182507	2013-05-22 19:28:41 +00:00
Jean-Luc Duprat	0dda6f168c	This is an update to a previous commit (r181216). The earlier change list introduced the following inst combines: B * (uitofp i1 C) —> select C, B, 0 A * (1 - uitofp i1 C) —> select C, 0, A select C, 0, B + select C, A, 0 —> select C, A, B Together these 3 changes would simplify : A * (1 - uitofp i1 C) + B * uitofp i1 C down to : select C, B, A In practice we found that the first two substitutions can have a negative effect on performance, because they reduce opportunities to use FMA contractions; between the two options FMAs are often the better choice. This change list amends the previous one to enable just these inst combines: select C, B, 0 + select C, 0, A —> select C, B, A A * (1 - uitofp i1 C) + B * uitofp i1 C —> select C, B, A llvm-svn: 182499	2013-05-22 18:29:31 +00:00
Benjamin Kramer	d76cc186fc	X86: When expanding PCMPGTQ to PCMPGTD we always want to compare the lower halves as unsigned. Take #2 on fixing PR15977. llvm-svn: 182486	2013-05-22 17:01:12 +00:00
Arnold Schwaighofer	12b0d1cda0	LoopVectorize: Make Value pointers that could be RAUW'ed a VH The Value pointers we store in the induction variable list can be RAUW'ed by a call to SCEVExpander::expandCodeFor, use a TrackingVH instead. Do the same thing in some other places where we store pointers that could potentially be RAUW'ed. Fixes PR16073. llvm-svn: 182485	2013-05-22 16:54:56 +00:00
David Majnemer	7ea2a52a0c	X86: Remove test instructions proceeding shift by immediate instructions Allow LLVM to take advantage of shift instructions that set the ZF flag, making instructions that test the destination superfluous. llvm-svn: 182454	2013-05-22 08:13:02 +00:00
Rafael Espindola	c823f00ed1	Use std::list so that we have a stable iterator. I will try to avoid creating these std::strings, but for now this gets the tests passing with libc++. llvm-svn: 182405	2013-05-21 18:53:50 +00:00
Akira Hatanaka	be76cd0b8e	[mips] Rename option to make it compatible with gcc. llvm-svn: 182397	2013-05-21 17:17:59 +00:00
Akira Hatanaka	6871031be9	[mips] Add instruction selection patterns for blez and bgez. llvm-svn: 182396	2013-05-21 17:13:47 +00:00
Justin Holewinski	48f4ad3fc0	[NVPTX] Add @llvm.nvvm.sqrt.f() intrinsic llvm-svn: 182394	2013-05-21 16:51:30 +00:00
Justin Holewinski	fff1f5f5e2	Drop @llvm.annotation and @llvm.ptr.annotation intrinsics during codegen. The intrinsic calls are dropped, but the annotated value is propagated. Fixes PR 15253 Original patch by Zeng Bin! llvm-svn: 182387	2013-05-21 14:37:16 +00:00
Evgeniy Stepanov	ebd7f8e7ef	[msan] A no-op implementation of VarArg handling. This stuff is used on platforms where MSan does not have a proper VarArg implementation (anything other than x86_64 at the moment). llvm-svn: 182375	2013-05-21 12:27:47 +00:00
Benjamin Kramer	18ef6b22b9	X86: When emulating unsigned PCMPGTQ with PCMPGTD, fix the sign bit for the smaller type. Otherwise we'll get a mix of signed and unsigned compares. Fixes PR15977. llvm-svn: 182364	2013-05-21 09:58:54 +00:00
Richard Sandiford	586f41777e	[SystemZ] Tighten branch tests After r182274, the branches in these tests must always be short. llvm-svn: 182358	2013-05-21 08:53:17 +00:00
Benjamin Kramer	8aaf197990	DAGCombine: Avoid an edge case where it tried to create an i0 type for (x & 0) == 0. Fixes PR16083. llvm-svn: 182357	2013-05-21 08:51:09 +00:00
Reed Kotler	75653a0677	Add checks that the proper predeined stubs are being called to the test case. These were accidentally omitted. llvm-svn: 182347	2013-05-21 01:27:36 +00:00
Manman Ren	9d4c735885	Dwarf: use a single line table to generate assembly when .loc is used. This is to fix PR15408 where an undefined symbol Lline_table_start1 is used. Since we do not generate the debug_line section when .loc is used, Lline_table_start1 is not emitted and we can't refer to it when calculating at_stmt_list for a compile unit. llvm-svn: 182344	2013-05-21 00:57:22 +00:00
Reed Kotler	0fed8d4ef7	Add some additional functions to the list of helper functions for pic calls. These need to be there so we don't try and use helper functions when we call those. As part of this, make sure that we properly exclude helper functions in pic mode when indirect calls are involved. llvm-svn: 182343	2013-05-21 00:50:30 +00:00
David Blaikie	e63d5d1633	PR14606: Debug Info for namespace aliases/DW_TAG_imported_module This resolves the last of the PR14606 failures in the GDB 7.5 test suite by implementing an optional name field for DW_TAG_imported_modules/DIImportedEntities and using that to implement C++ namespace aliases (eg: "namespace X = Y;"). llvm-svn: 182328	2013-05-20 22:50:35 +00:00
Sebastian Pop	0bfafbaf52	add polly to check-all llvm-svn: 182308	2013-05-20 18:49:15 +00:00
Akira Hatanaka	5de4416962	[mips] Add (setne $lhs, 0) instruction selection pattern. llvm-svn: 182307	2013-05-20 18:18:07 +00:00
Akira Hatanaka	1cb024207f	[mips] Trap on integer division by zero. By default, a teq instruction is inserted after integer divide. No divide-by-zero checks are performed if option "-mnocheck-zero-division" is used. llvm-svn: 182306	2013-05-20 18:07:43 +00:00
Justin Holewinski	4c47d87ba6	[NVPTX] Fix mis-use of CurrentFnSym in NVPTXAsmPrinter. This was causing a symbol name error in the output PTX. llvm-svn: 182298	2013-05-20 16:42:18 +00:00
Tom Stellard	f0de44cc89	R600: Fix rotr.ll on non-asserts builds The -debug-only option is only available on asserts builds. llvm-svn: 182291	2013-05-20 15:28:48 +00:00
Tom Stellard	d2eebf001e	R600/SI: Add pattern for rotr Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 182286	2013-05-20 15:02:24 +00:00
Tom Stellard	5643c4ac72	R600: Swap the legality of rotl and rotr The hardware supports rotr and not rotl. llvm-svn: 182285	2013-05-20 15:02:19 +00:00
Tom Stellard	1cfd7a50bb	R600/SI: Add patterns for 64-bit shift operations Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 182284	2013-05-20 15:02:12 +00:00
Mihai Popa	f41e3f56a5	VSTn instructions have a number of encoding constraints which are not implemented. I have added these using wrapper methods around the original custom decoder (incidentally - this is a huge poorly written method that should be cleaned up. I have left it as is since the changes would be much to hard to review). llvm-svn: 182281	2013-05-20 14:57:05 +00:00
Mihai Popa	dcf0922720	Q registers are encoded in fields of the same length as D registers. As Q registers are half as many, the ARM reference manual mandates the least significant bit to be zeroed out. Failure to do so should result in an undefined instruction. With this change test/MC/Disassembler/ARM/invalid-VQADD-arm.txt is passing (removed XFAIL). llvm-svn: 182279	2013-05-20 14:42:43 +00:00
Richard Sandiford	312425f32d	[SystemZ] Add long branch pass Before this change, the SystemZ backend would use BRCL for all branches and only consider shortening them to BRC when generating an object file. E.g. a branch on equal would use the JGE alias of BRCL in assembly output, but might be shortened to the JE alias of BRC in ELF output. This was a useful first step, but it had two problems: (1) The z assembler isn't traditionally supposed to perform branch shortening or branch relaxation. We followed this rule by not relaxing branches in assembler input, but that meant that generating assembly code and then assembling it would not produce the same result as going directly to object code; the former would give long branches everywhere, whereas the latter would use short branches where possible. (2) Other useful branches, like COMPARE AND BRANCH, do not have long forms. We would need to do something else before supporting them. (Although COMPARE AND BRANCH does not change the condition codes, the plan is to model COMPARE AND BRANCH as a CC-clobbering instruction during codegen, so that we can safely lower it to a separate compare and long branch where necessary. This is not a valid transformation for the assembler proper to make.) This patch therefore moves branch relaxation to a pre-emit pass. For now, calls are still shortened from BRASL to BRAS by the assembler, although this too is not really the traditional behaviour. The first test takes about 1.5s to run, and there are likely to be more tests in this vein once further branch types are added. The feeling on IRC was that 1.5s is a bit much for a single test, so I've restricted it to SystemZ hosts for now. The patch exposes (and fixes) some typos in the main CodeGen/SystemZ tests. A later patch will remove the {{g}}s from that directory. llvm-svn: 182274	2013-05-20 14:23:08 +00:00
Justin Holewinski	01f89f0428	[NVPTX] Add GenericToNVVM IR converter to better handle idiomatic LLVM IR inputs This converter currently only handles global variables in address space 0. For these variables, they are promoted to address space 1 (global memory), and all uses are updated to point to the result of a cvta.global instruction on the new variable. The motivation for this is address space 0 global variables are illegal since we cannot declare variables in the generic address space. Instead, we place the variables in address space 1 and explicitly convert the pointer to address space 0. This is primarily intended to help new users who expect to be able to place global variables in the default address space. llvm-svn: 182254	2013-05-20 12:13:32 +00:00
Justin Holewinski	700b6fa934	[NVPTX] Fix i1 kernel parameters and global variables. ABI rules say we need to use .u8 for i1 parameters for kernels. llvm-svn: 182253	2013-05-20 12:13:28 +00:00
Stepan Dyatkovskiy	d0e34a200f	PR15868 fix. Introduction: In case when stack alignment is 8 and GPRs parameter part size is not N8: we add padding to GPRs part, so part's last byte must be recovered at address K8-1. We need to do it, since remained (stack) part of parameter starts from address K8, and we need to "attach" "GPRs head" without gaps to it: Stack: \|---- 8 bytes block ----\| \|---- 8 bytes block ----\| \|---- 8 bytes... [ [padding] [GPRs head] ] [ ------ Tail passed via stack ------ ... FIX: Note, once we added padding we need to correct all* Arg offsets that are going after padded one. That's why we need this fix: Arg offsets were never corrected before this patch. See new test-cases included in patch. We also don't need to insert padding for byval parameters that are stored in GPRs only. We need pad only last byval parameter and only in case it outsides GPRs and stack alignment = 8. Though, stack area, allocated for recovered byval params, must satisfy "Size mod 8 = 0" restriction. This patch reduces stack usage for some cases: We can reduce ArgRegsSaveArea since inner N*4 bytes sized byval params my be "packed" with alignment 4 in some cases. llvm-svn: 182237	2013-05-20 08:01:34 +00:00
Renato Golin	9e18922d67	Disable remote MCJIT on pre-v6 ARM llvm-svn: 182235	2013-05-20 07:46:06 +00:00
Jakob Stoklund Olesen	f927800325	Also expand 64-bit bitcasts. llvm-svn: 182229	2013-05-20 01:01:43 +00:00
Jakob Stoklund Olesen	c7bc5fbc5c	Implement spill and fill of I64Regs. llvm-svn: 182228	2013-05-20 00:53:25 +00:00
Jakob Stoklund Olesen	751e9b8407	Mark i64 SETCC as expand so it is turned into a SELECT_CC. llvm-svn: 182227	2013-05-20 00:28:36 +00:00
Jakob Stoklund Olesen	86c5469d26	Don't use %g0 to materialize 0 directly. The wired physreg doesn't work on tied operands like on MOVXCC. Add a README note to fix this later. llvm-svn: 182225	2013-05-19 21:47:13 +00:00
Jakob Stoklund Olesen	92ebf1153e	Select i64 values with %icc conditions. llvm-svn: 182224	2013-05-19 20:38:21 +00:00
Jakob Stoklund Olesen	7ca944b9db	Add floating point selects on %xcc predicates. llvm-svn: 182222	2013-05-19 20:33:11 +00:00
Jakob Stoklund Olesen	4a78c86a6a	Implement SPselectfcc for i64 operands. Also clean up the arguments to all the MOVCC instructions so the operands always are (true-val, false-val, cond-code). llvm-svn: 182221	2013-05-19 20:20:54 +00:00
Venkatraman Govindaraju	3320e5a921	[Sparc] Rearrange integer registers' allocation order so that register allocator will use I and G registers before using L and O registers. Also, enable registers %g2-%g4 to be used in application and %g5 in 64 bit mode. llvm-svn: 182219	2013-05-19 20:07:20 +00:00
Jakob Stoklund Olesen	ead983cec9	Handle i64 FrameIndex nodes in SPARC v9 mode. llvm-svn: 182216	2013-05-19 19:14:24 +00:00
Tim Northover	77d0a4ac62	Invalidate instruction cache when setting memory to be executable. lli's remote MCJIT code calls setExecutable just prior to running code. In line with Darwin behaviour this seems to be the place to invalidate any caches needed so that relocations can take effect properly. llvm-svn: 182213	2013-05-19 15:28:16 +00:00
Bob Wilson	1dbf9a236f	Temporarily disable this test because it is failing when using libc++. llvm-svn: 182212	2013-05-19 14:59:08 +00:00
Benjamin Kramer	75f0923ab2	Move the remaining simplify-libcalls tests to instcombine, merging most of them into a single file. llvm-svn: 182211	2013-05-19 13:28:39 +00:00
Renato Golin	d684165620	Unsupported remote JIT on ARM llvm-svn: 182201	2013-05-18 19:42:07 +00:00
David Majnemer	beab5678a3	isKnownToBeAPowerOfTwo: (X & Y) + Y is a power of 2 or zero if y is also. This is useful if something that looks like (x & (1 << y)) ? 64 : 32 is the divisor in a modulo operation. llvm-svn: 182200	2013-05-18 19:30:37 +00:00
Arnold Schwaighofer	693a1ca628	LoopVectorize: Handle single edge PHIs We might encouter single edge PHIs - handle them with an identity select. Fixes PR15990. llvm-svn: 182199	2013-05-18 18:38:34 +00:00
Hal Finkel	2f474f0e8a	Check InlineAsm clobbers in PPCCTRLoops We don't need to reject all inline asm as using the counter register (most does not). Only those that explicitly clobber the counter register need to prevent the transformation. llvm-svn: 182191	2013-05-18 09:20:39 +00:00
David Majnemer	5ba473afb0	X86: Bad peephole interaction between adc, MOV32r0 The peephole tries to reorder MOV32r0 instructions such that they are before the instruction that modifies EFLAGS. The problem is that the peephole does not consider the case where the instruction that modifies EFLAGS also depends on the previous state of EFLAGS. Instead, walk backwards until we find an instruction that has a def for EFLAGS but does not have a use. If we find such an instruction, insert the MOV32r0 before it. If it cannot find such an instruction, skip the optimization. llvm-svn: 182184	2013-05-18 01:02:03 +00:00
JF Bastien	97b08c404c	Support unaligned load/store on more ARM targets This patch matches GCC behavior: the code used to only allow unaligned load/store on ARM for v6+ Darwin, it will now allow unaligned load/store for v6+ Darwin as well as for v7+ on Linux and NaCl. The distinction is made because v6 doesn't guarantee support (but LLVM assumes that Apple controls hardware+kernel and therefore have conformant v6 CPUs), whereas v7 does provide this guarantee (and Linux/NaCl behave sanely). The patch keeps the -arm-strict-align command line option, and adds -arm-no-strict-align. They behave similarly to GCC's -mstrict-align and -mnostrict-align. I originally encountered this discrepancy in FastIsel tests which expect unaligned load/store generation. Overall this should slightly improve performance in most cases because of reduced I$ pressure. llvm-svn: 182175	2013-05-17 23:49:01 +00:00
Rafael Espindola	f5bb53f19f	Convert obj2yaml to use yamlio. llvm-svn: 182169	2013-05-17 22:58:42 +00:00
Vincent Lejeune	d3fcb5016c	R600: Lower int_load_input to copyFromReg instead of Register node It solves a bug uncovered by dot4 patch where the register class of int_load_input use was ignored. llvm-svn: 182130	2013-05-17 16:51:06 +00:00
Vincent Lejeune	3d5118ca40	R600: Use bottom up scheduling algorithm llvm-svn: 182129	2013-05-17 16:50:56 +00:00
Vincent Lejeune	4c81d4da6f	R600: Use depth first scheduling algorithm It should increase PV substitution opportunities and lower gpr usage (pending computations path are "flushed" sooner) llvm-svn: 182128	2013-05-17 16:50:44 +00:00
Vincent Lejeune	519f21eed3	R600: Relax some vector constraints on Dot4. Dot4 now uses 8 scalar operands instead of 2 vectors one which allows register coalescer to remove some unneeded COPY. This patch also defines some structures/functions that can be used to handle every vector instructions (CUBE, Cayman special instructions...) in a similar fashion. llvm-svn: 182126	2013-05-17 16:50:32 +00:00
Vincent Lejeune	d3eed66e8c	R600: Improve texture handling llvm-svn: 182125	2013-05-17 16:50:20 +00:00
Vincent Lejeune	4ebef18ab5	R600: Rename 128 bit registers. Almost all instructions that takes a 128 bits reg as input (fetch, export...) have the abilities to swizzle their argument and output. Instead of printing default swizzle for each 128 bits reg, rename T.XYZW to T and let instructions print potentially optimized swizzles themselves. llvm-svn: 182124	2013-05-17 16:50:09 +00:00
Tom Stellard	ecc2ad1cd4	R600: Fix encoding for R600 family GPUs Reviewed-by: Vincent Lejeune <vljn@ovi.com> https://bugs.freedesktop.org/show_bug.cgi?id=64193 https://bugs.freedesktop.org/show_bug.cgi?id=64257 https://bugs.freedesktop.org/show_bug.cgi?id=64320 NOTE: This is a candidate for the 3.3 branch. llvm-svn: 182113	2013-05-17 15:23:21 +00:00
Venkatraman Govindaraju	641b0b5a21	[Sparc] Implements hasReservedCallFrame and hasFP. This is to generate correct framesetup code when the function has variable sized allocas. llvm-svn: 182108	2013-05-17 15:14:34 +00:00
Benjamin Kramer	fc33e1d99b	X86: Make shuffle -> shift conversion more aggressive about undefs. Shuffles that only move an element into position 0 of the vector are common in the output of the loop vectorizer and often generate suboptimal code when SSSE3 is not available. Lower them to vector shifts if possible. We still prefer palignr over psrldq because it has higher throughput on sandybridge. llvm-svn: 182102	2013-05-17 14:48:34 +00:00
Benjamin Kramer	7ccd1b86bd	FileCheckize test. llvm-svn: 182101	2013-05-17 14:48:25 +00:00
Ulrich Weigand	6e23ac606e	[PowerPC] Merge/rename PPC fixup types Now that fixup_ppc_ha16 and fixup_ppc_lo16 are being treated exactly the same everywhere, it no longer makes sense to have two fixup types. This patch merges them both into a single type fixup_ppc_half16, and renames fixup_ppc_lo16_ds to fixup_ppc_half16ds for consistency. (The half16 and half16ds names are taken from the description of relocation types in the PowerPC ABI.) No change in code generation expected. llvm-svn: 182092	2013-05-17 12:37:21 +00:00
Ulrich Weigand	994f49ed79	[PowerPC] Fix processing of ha16/lo16 fixups The current PowerPC MC back end distinguishes between fixup_ppc_ha16 and fixup_ppc_lo16, which are determined by the instruction the fixup applies to, and uses this distinction to decide whether a fixup ought to resolve to the high or the low part of a symbol address. This isn't quite correct, however. It is valid -if unusual- assembler to use, e.g. li 1, symbol@ha or lis 1, symbol@l Whether the high or the low part of the address is used depends solely on the @ suffix, not on the instruction. In addition, both li 1, symbol and lis 1, symbol are valid, assuming the symbol address fits into 16 bits; again, both will then refer to the actual symbol value (so li will load the value itself, while lis will load the value shifted by 16). To fix this, two places need to be adapted. If the fixup cannot be resolved at assembler time, a relocation needs to be emitted via PPCELFObjectWriter::getRelocType. This routine already looks at the VK_ type to determine the relocation. The only problem is that will reject any _LO modifier in a ha16 fixup and vice versa. This is simply incorrect; any of those modifiers ought to be accepted for either fixup type. If the fixup can be resolved at assembler time, adjustFixupValue currently selects the high bits of the symbol value if the fixup type is ha16. Again, this is incorrect; see the above example lis 1, symbol Now, in theory we'd have to respect a VK_ modifier here. However, in fact common code never even attempts to resolve symbol references using any nontrivial VK_ modifier at assembler time; it will always fall back to emitting a reloc and letting the linker handle it. If this ever changes, presumably there'd have to be a target callback to resolve VK_ modifiers. We'd then have to handle @ha etc. there. llvm-svn: 182091	2013-05-17 12:36:29 +00:00
Venkatraman Govindaraju	54bf611c79	[Sparc] Prevent instructions that defines or uses %o7 to be in call's delay slot. llvm-svn: 182063	2013-05-16 23:53:29 +00:00
Adrian Prantl	9c93059aa4	Generate debug info for by-value struct args even if they are not used. radar://problem/13865940 llvm-svn: 182062	2013-05-16 23:44:12 +00:00
Akira Hatanaka	252f54f769	[mips] Improve instruction selection for pattern (store (fp_to_sint $src), $ptr). Previously, three instructions were needed: trunc.w.s $f0, $f2 mfc1 $4, $f0 sw $4, 0($2) Now we need only two: trunc.w.s $f0, $f2 swc1 $f0, 0($2) llvm-svn: 182053	2013-05-16 21:17:15 +00:00
Rafael Espindola	da5d100005	More test coverage for addFrameMove. llvm-svn: 182051	2013-05-16 20:50:56 +00:00
Hal Finkel	778c73c56c	Fix cpu on test CodeGen/PowerPC/ctrloop-fp64.ll We need ppc instead of generic to override native features on ppc machines. llvm-svn: 182049	2013-05-16 20:28:05 +00:00
Jack Carter	03f0fd37a9	Mips assembler: Add TwoOperandConstraint definitions This patch removes alias definition for addiu $rs,$imm and instead uses the TwoOperandAliasConstraint field in the ArithLogicI instruction class. This way all instructions that inherit ArithLogicI class have the same macro defined. The usage examples are added to test files. Patch by Vladimir Medic llvm-svn: 182048	2013-05-16 20:24:27 +00:00
Rafael Espindola	aed131d61d	More addFrameMove test coverage. llvm-svn: 182046	2013-05-16 20:00:45 +00:00
Hal Finkel	5f587c59a5	Create an new preheader in PPCCTRLoops to avoid counter register clobbers Some IR-level instructions (such as FP <-> i64 conversions) are not chained w.r.t. the mtctr intrinsic and yet may become function calls that clobber the counter register. At the selection-DAG level, these might be reordered with the mtctr intrinsic causing miscompiles. To avoid this situation, if an existing preheader has instructions that might use the counter register, create a new preheader for the mtctr intrinsic. This extra block will be remerged with the old preheader at the MI level, but will prevent unwanted reordering at the selection-DAG level. llvm-svn: 182045	2013-05-16 19:58:38 +00:00
Akira Hatanaka	fce4dd7974	[mips] Test case for r182042. Add comment. llvm-svn: 182044	2013-05-16 19:57:23 +00:00
Rafael Espindola	81250934d7	More test coverage for addFrameMove. llvm-svn: 182041	2013-05-16 19:44:40 +00:00
Jack Carter	51785c4715	Mips assembler: Add branch macro definitions This patch adds bnez and beqz instructions which represent alias definitions for bne and beq instructions as follows: bnez $rs,$imm => bne $rs,$zero,$imm beqz $rs,$imm => beq $rs,$zero,$imm The corresponding test cases are added. Patch by Vladimir Medic llvm-svn: 182040	2013-05-16 19:40:19 +00:00
Benjamin Kramer	fc88c3761f	DAGCombine: Also shrink eq compares where the constant is exactly as large as the smaller type. if ((x & 255) == 255) before: movzbl %al, %eax cmpl $255, %eax after: cmpb $-1, %al llvm-svn: 182038	2013-05-16 18:47:58 +00:00
Ulrich Weigand	9d980cbdb9	[PowerPC] Use true offset value in "memrix" machine operands This is the second part of the change to always return "true" offset values from getPreIndexedAddressParts, tackling the case of "memrix" type operands. This is about instructions like LD/STD that only have a 14-bit field to encode immediate offsets, which are implicitly extended by two zero bits by the machine, so that in effect we can access 16-bit offsets as long as they are a multiple of 4. The PowerPC back end currently handles such instructions by carrying the 14-bit value (as it will get encoded into the actual machine instructions) in the machine operand fields for such instructions. This means that those values are in fact not the true offset, but rather the offset divided by 4 (and then truncated to an unsigned 14-bit value). Like in the case fixed in r182012, this makes common code operations on such offset values not work as expected. Furthermore, there doesn't really appear to be any strong reason why we should encode machine operands this way. This patch therefore changes the encoding of "memrix" type machine operands to simply contain the "true" offset value as a signed immediate value, while enforcing the rules that it must fit in a 16-bit signed value and must also be a multiple of 4. This change must be made simultaneously in all places that access machine operands of this type. However, just about all those changes make the code simpler; in many cases we can now just share the same code for memri and memrix operands. llvm-svn: 182032	2013-05-16 17:58:02 +00:00
Hal Finkel	47db66d43f	PPC32 cannot form counter loops around i64 FP conversions On PPC32, i64 FP conversions are implemented using runtime calls (which clobber the counter register). These must be excluded. llvm-svn: 182023	2013-05-16 16:52:41 +00:00
Rafael Espindola	eb03f8a75f	Add a triple to the test to try to fix the windows bots. llvm-svn: 182022	2013-05-16 16:48:46 +00:00
Rafael Espindola	8174c8cc68	More addFrameMove test coverage. llvm-svn: 182021	2013-05-16 16:34:38 +00:00
Bill Schmidt	22f9191979	Use new CHECK-DAG support to stabilize CodeGen/PowerPC/recipest.ll While testing some experimental code to add vector-scalar registers to PowerPC, I noticed that a couple of independent instructions were flipped by the scheduler. The new CHECK-DAG support is perfect for avoiding this problem. llvm-svn: 182020	2013-05-16 16:15:18 +00:00
Rafael Espindola	12adfd8e23	Add more addFrameMove test coverage. llvm-svn: 182019	2013-05-16 16:09:54 +00:00
Rafael Espindola	c6b7383bda	Add more test coverage for addFrameMove. llvm-svn: 182017	2013-05-16 15:18:50 +00:00
Ulrich Weigand	7aa76b6a07	[PowerPC] Report true displacement value from getPreIndexedAddressParts DAGCombiner::CombineToPreIndexedLoadStore calls a target routine to decompose a memory address into a base/offset pair. It expects the offset (if constant) to be the true displacement value in order to perform optional additional optimizations; in particular, to convert other uses of the original pointer into uses of the new base pointer after pre-increment. The PowerPC implementation of getPreIndexedAddressParts, however, simply calls SelectAddressRegImm, which returns a TargetConstant. This value is appropriate for encoding into the instruction, but it is not always usable as true displacement value: - Its type is always MVT::i32, even on 64-bit, where addresses ought to be i64 ... this causes the optimization to simply always fail on 64-bit due to this line in DAGCombiner: // FIXME: In some cases, we can be smarter about this. if (Op1.getValueType() != Offset.getValueType()) { - Its value is truncated to an unsigned 16-bit value if negative. This causes the above opimization to generate wrong code. This patch fixes both problems by simply returning the true displacement value (in its original type). This doesn't affect any other user of the displacement. llvm-svn: 182012	2013-05-16 14:53:05 +00:00
Rafael Espindola	1f2584025d	Add more addFrameMove test coverage. llvm-svn: 182011	2013-05-16 14:51:26 +00:00
Rafael Espindola	c533b559d0	Extend test to check the .cfi instructions. I am about to refactor the calls to addFrameMove and some of the ppc ones were not being tested. llvm-svn: 182009	2013-05-16 14:30:09 +00:00
Benjamin Kramer	41942fb37a	Relax CHECK-NEXTs a bit to cope with atom's return nop padding. llvm-svn: 181999	2013-05-16 11:46:50 +00:00
Evgeniy Stepanov	1e7643243d	[msan] Switch TLS globals to initial-exec model. They are always defined in the main executable. llvm-svn: 181994	2013-05-16 09:14:05 +00:00
Richard Smith	e04f0d34d1	Respect the 'nobuiltin' attribute when determining if a call is to a memory builtin. llvm-svn: 181978	2013-05-16 04:12:04 +00:00
Rafael Espindola	a5c7ceedb5	Extend test for better coverage. Without this change nothing was covering this addFrameMove: // For 64-bit SVR4 when we have spilled CRs, the spill location // is SP+8, not a frame-relative slot. if (Subtarget.isSVR4ABI() && Subtarget.isPPC64() && (PPC::CR2 <= Reg && Reg <= PPC::CR4)) { MachineLocation CSDst(PPC::X1, 8); MachineLocation CSSrc(PPC::CR2); MMI.addFrameMove(Label, CSDst, CSSrc); continue; } llvm-svn: 181976	2013-05-16 03:48:50 +00:00
Reed Kotler	515e937685	Patch number 2 for mips16/32 floating point interoperability stubs. This creates stubs that help Mips32 functions call Mips16 functions which have floating point parameters that are normally passed in floating point registers. llvm-svn: 181972	2013-05-16 02:17:42 +00:00
David Majnemer	37d3825022	Set an explicit triple for this test. This allows the test to correctly check symbol names. llvm-svn: 181939	2013-05-15 22:23:21 +00:00
David Majnemer	8f16974273	X86: Remove redundant test instructions Increase the number of instructions LLVM recognizes as setting the ZF flag. This allows us to remove test instructions that redundantly recalculate the flag. llvm-svn: 181937	2013-05-15 22:03:08 +00:00
Hal Finkel	25c1992bc7	Implement PPC counter loops as a late IR-level pass The old PPCCTRLoops pass, like the Hexagon pass version from which it was derived, could only handle some simple loops in canonical form. We cannot directly adapt the new Hexagon hardware loops pass, however, because the Hexagon pass contains a fundamental assumption that non-constant-trip-count loops will contain a guard, and this is not always true (the result being that incorrect negative counts can be generated). With this commit, we replace the pass with a late IR-level pass which makes use of SE to calculate the backedge-taken counts and safely generate the loop-count expressions (including any necessary max() parts). This IR level pass inserts custom intrinsics that are lowered into the desired decrement-and-branch instructions. The most fragile part of this new implementation is that interfering uses of the counter register must be detected on the IR level (and, on PPC, this also includes any indirect branches in addition to function calls). Also, to make all of this work, we need a variant of the mtctr instruction that is marked as having side effects. Without this, machine-code level CSE, DCE, etc. illegally transform the resulting code. Hopefully, this can be improved in the future. This new pass is smaller than the original (and much smaller than the new Hexagon hardware loops pass), and can handle many additional cases correctly. In addition, the preheader-creation code has been copied from LoopSimplify, and after we decide on where it belongs, this code will be refactored so that it can be explicitly shared (making this implementation even smaller). The new test-case files ctrloop-{le,lt,ne}.ll have been adapted from tests for the new Hexagon pass. There are a few classes of loops that this pass does not transform (noted by FIXMEs in the files), but these deficiencies can be addressed within the SE infrastructure (thus helping many other passes as well). llvm-svn: 181927	2013-05-15 21:37:41 +00:00
Derek Schuff	d2c42d766d	Fix miscompile due to StackColoring incorrectly merging stack slots (PR15707) IR optimisation passes can result in a basic block that contains: llvm.lifetime.start(%buf) ... llvm.lifetime.end(%buf) ... llvm.lifetime.start(%buf) Before this change, calculateLiveIntervals() was ignoring the second lifetime.start() and was regarding %buf as being dead from the lifetime.end() through to the end of the basic block. This can cause StackColoring to incorrectly merge %buf with another stack slot. Fix by removing the incorrect Starts[pos].isValid() and Finishes[pos].isValid() checks. Just doing: Starts[pos] = Indexes->getMBBStartIdx(MBB); Finishes[pos] = Indexes->getMBBEndIdx(MBB); unconditionally would be enough to fix the bug, but it causes some test failures due to stack slots not being merged when they were before. So, in order to keep the existing tests passing, treat LiveIn and LiveOut separately rather than approximating the live ranges by merging LiveIn and LiveOut. This fixes PR15707. Patch by Mark Seaborn. llvm-svn: 181922	2013-05-15 21:15:09 +00:00
Ulrich Weigand	2fb140ef31	[PowerPC] Remove need for adjustFixupOffst hack Now that applyFixup understands differently-sized fixups, we can define fixup_ppc_lo16/fixup_ppc_lo16_ds/fixup_ppc_ha16 to properly be 2-byte fixups, applied at an offset of 2 relative to the start of the instruction text. This has the benefit that if we actually need to generate a real relocation record, its address will come out correctly automatically, without having to fiddle with the offset in adjustFixupOffset. Tested on both 64-bit and 32-bit PowerPC, using external and integrated assembler. llvm-svn: 181894	2013-05-15 15:07:06 +00:00
Richard Sandiford	ffd144174d	[SystemZ] Make use of SUBTRACT HALFWORD Thanks to Ulrich Weigand for noticing that this instruction was missing. llvm-svn: 181893	2013-05-15 15:05:29 +00:00
Ulrich Weigand	e7050ad0a1	[PowerPC] Add test case for r181891 llvm-svn: 181892	2013-05-15 15:02:12 +00:00
Richard Sandiford	78a8ef87ca	[SystemZ] Consolidate disassembler tests for valid input into 2 big tests llvm-svn: 181879	2013-05-15 11:00:31 +00:00
Richard Sandiford	364d821ebc	[SystemZ] Consolidate assembler tests into 4 big tests llvm-svn: 181878	2013-05-15 09:58:19 +00:00
Arnold Schwaighofer	2d920477a4	LoopVectorize: Hoist conditional loads if possible InstCombine can be uncooperative to vectorization and sink loads into conditional blocks. This prevents vectorization. Undo this optimization if there are unconditional memory accesses to the same addresses in the loop. radar://13815763 llvm-svn: 181860	2013-05-15 01:44:30 +00:00
Ahmed Bougacha	9dab0cc6c3	Object: Fix Mach-O relocation printing. There were two problems that made llvm-objdump -r crash: - for non-scattered relocations, the symbol/section index is actually in the (aptly named) symbolnum field. - sections are 1-indexed. llvm-svn: 181843	2013-05-14 22:41:29 +00:00
Arnold Schwaighofer	af85f6083a	ARM ISel: Don't create illegal types during LowerMUL The transformation happening here is that we want to turn a "mul(ext(X), ext(X))" into a "vmull(X, X)", stripping off the extension. We have to make sure that X still has a valid vector type - possibly recreate an extension to a smaller type. In case of a extload of a memory type smaller than 64 bit we used create a ext(load()). The problem with doing this - instead of recreating an extload - is that an illegal type is exposed. This patch fixes this by creating extloads instead of ext(load()) sequences. Fixes PR15970. radar://13871383 llvm-svn: 181842	2013-05-14 22:33:24 +00:00
Manman Ren	b3c52fb45b	GlobalOpt: fix an issue where CXAAtExitFn points to a deleted function. CXAAtExitFn was set outside a loop and before optimizations where functions can be deleted. This patch will set CXAAtExitFn inside the loop and after optimizations. Seg fault when running LTO because of accesses to a deleted function. rdar://problem/13838828 llvm-svn: 181838	2013-05-14 21:52:44 +00:00
Michael Liao	91a1b2c9eb	Add 'CHECK-DAG' support Refer to 'FileCheck.rst'f for details of 'CHECK-DAG'. llvm-svn: 181827	2013-05-14 20:34:12 +00:00
Bill Schmidt	a87a7e2620	Implement the PowerPC system call (sc) instruction. Instruction added at request of Roman Divacky. Tested via asm-parser. llvm-svn: 181821	2013-05-14 19:35:45 +00:00
Jyotsna Verma	803e506fec	Hexagon: Pass to replace tranfer/copy instructions into combine instruction where possible. llvm-svn: 181817	2013-05-14 18:54:06 +00:00
Eric Christopher	b27cd8bea6	Reapply "Subtract isn't commutative, fix this for MMX psub." with a somewhat randomly chosen cpu that will minimize cpu specific differences on bots. llvm-svn: 181814	2013-05-14 18:33:40 +00:00
Eric Christopher	3eee7454cf	Temporarily revert "Subtract isn't commutative, fix this for MMX psub." It's causing failures on the atom bot. llvm-svn: 181812	2013-05-14 18:20:42 +00:00
Eric Christopher	0344f495f9	Subtract isn't commutative, fix this for MMX psub. Patch by Andrea DiBiagio. llvm-svn: 181809	2013-05-14 17:52:05 +00:00
Jakob Stoklund Olesen	abc3d23ccb	Recognize sparc64 as an alias for sparcv9 triples. Patch by Brad Smith! llvm-svn: 181808	2013-05-14 17:47:27 +00:00
Jyotsna Verma	2dca82ad1c	Hexagon: Add patterns to generate 'combine' instructions. llvm-svn: 181805	2013-05-14 17:16:38 +00:00
Jyotsna Verma	11bd54afd6	Hexagon: ArePredicatesComplement should not restrict itself to TFRs. llvm-svn: 181803	2013-05-14 16:36:34 +00:00
Derek Schuff	bd7c6e5015	Fix ARM FastISel tests, as a first step to enabling ARM FastISel ARM FastISel is currently only enabled for iOS non-Thumb1, and I'm working on enabling it for other targets. As a first step I've fixed some of the tests. Changes to ARM FastISel tests: - Different triples don't generate the same relocations (especially movw/movt versus constant pool loads). Use a regex to allow either. - Mangling is different. Use a regex to allow either. - The reserved registers are sometimes different, so registers get allocated in a different order. Capture the names only where this occurs. - Add -verify-machineinstrs to some tests where it works. It doesn't work everywhere it should yet. - Add -fast-isel-abort to many tests that didn't have it before. - Split out the VarArg test from fast-isel-call.ll into its own test. This simplifies test setup because of --check-prefix. Patch by JF Bastien llvm-svn: 181801	2013-05-14 16:26:38 +00:00
Bill Schmidt	ef3d1a24ed	PPC32: Fix stack collision between FP and CR save areas. The changes to CR spill handling missed a case for 32-bit PowerPC. The code in PPCFrameLowering::processFunctionBeforeFrameFinalized() checks whether CR spill has occurred using a flag in the function info. This flag is only set by storeRegToStackSlot and loadRegFromStackSlot. spillCalleeSavedRegisters does not call storeRegToStackSlot, but instead produces MI directly. Thus we don't see the CR is spilled when assigning frame offsets, and the CR spill ends up colliding with some other location (generally the FP slot). This patch sets the flag in spillCalleeSavedRegisters for PPC32 so that the CR spill is properly detected and gets its own slot in the stack frame. llvm-svn: 181800	2013-05-14 16:08:32 +00:00
Jyotsna Verma	7dcbb96e26	Hexagon: Test case to check if branch probabilities are properly reflected in the jump instructions in the form of taken/not-taken hint. llvm-svn: 181799	2013-05-14 15:50:49 +00:00
Richard Sandiford	eb9af29426	[SystemZ] Add disassembler support llvm-svn: 181777	2013-05-14 10:17:52 +00:00
Michel Danzer	1290369f7b	R600/SI: Add lit test coverage for the remaining patterns added recently Reviewed-by: Christian König <christian.koenig@amd.com> llvm-svn: 181775	2013-05-14 09:53:30 +00:00
Richard Sandiford	18272f8490	[SystemZ] Add extra testscases for r181773 Forgot to svn add these... llvm-svn: 181774	2013-05-14 09:49:11 +00:00
Richard Sandiford	1fb5883d77	[SystemZ] Rework handling of constant PC-relative operands The GNU assembler treats things like: brasl %r14, 100 in the same way as: brasl %r14, .+100 rather than as a branch to absolute address 100. We implemented this in LLVM by creating an immediate operand rather than the usual expr operand, and by handling immediate operands specially in the code emitter. This was undesirable for (at least) three reasons: - the specialness of immediate operands was exposed to the backend MC code, rather than being limited to the assembler parser. - in disassembly, an immediate operand really is an absolute address. (Note that this means reassembling printed disassembly can't recreate the original code.) - it would interfere with any assembly manipulation that we might try in future. E.g. operations like branch shortening can change the relative position of instructions, but any code that updates sym+offset addresses wouldn't update an immediate "100" operand in the same way as an explicit ".+100" operand. This patch changes the implementation so that the assembler creates a "." label for immediate PC-relative operands, so that the operand to the MCInst is always the absolute address. The patch also adds some error checking of the offset. llvm-svn: 181773	2013-05-14 09:47:26 +00:00
Reed Kotler	2c4657d9b7	This is the first of three patches which creates stubs used for Mips16/32 floating point interoperability. When Mips16 code calls external functions that would normally have some of its parameters or return values passed in floating point registers, it needs (Mips32) helper functions to do this because while in Mips16 mode there is no ability to access the floating point registers. In Pic mode, this is done with a set of predefined functions in libc. This case is already handled in llvm for Mips16. In static relocation mode, for efficiency reasons, the compiler generates stubs that the linker will use if it turns out that the external function is a Mips32 function. (If it's Mips16, then it does not need the helper stubs). These stubs are identically named and the linker knows about these tricks and will not create multiple copies and will delete them if they are not needed. llvm-svn: 181753	2013-05-14 02:00:24 +00:00
Akira Hatanaka	1f24e6a6a2	StackColoring: don't clear an instruction's mem operand if the underlying object is a PseudoSourceValue and PseudoSourceValue::isConstant returns true (i.e., points to memory that has a constant value). llvm-svn: 181751	2013-05-14 01:42:44 +00:00
Arnold Schwaighofer	2e7a922a15	LoopVectorize: Handle loops with multiple forward inductions We used to give up if we saw two integer inductions. After this patch, we base further induction variables on the chosen one like we do in the reverse induction and pointer induction case. Fixes PR15720. radar://13851975 llvm-svn: 181746	2013-05-14 00:21:18 +00:00
Michael Gottesman	a76143eeee	[objc-arc-opts] In the presense of an alloca unconditionally remove RR pairs if and only if we are both KnownSafeBU/KnownSafeTD rather than just either or. In the presense of a block being initialized, the frontend will emit the objc_retain on the original pointer and the release on the pointer loaded from the alloca. The optimizer will through the provenance analysis realize that the two are related (albiet different), but since we only require KnownSafe in one direction, will match the inner retain on the original pointer with the guard release on the original pointer. This is fixed by ensuring that in the presense of allocas we only unconditionally remove pointers if both our retain and our release are KnownSafe (i.e. we are KnownSafe in both directions) since we must deal with the possibility that the frontend will emit what (to the optimizer) appears to be unbalanced retain/releases. An example of the miscompile is: %A = alloca retain(%x) retain(%x) <--- Inner Retain store %x, %A %y = load %A ... DO STUFF ... release(%y) call void @use(%x) release(%x) <--- Guarding Release getting optimized to: %A = alloca retain(%x) store %x, %A %y = load %A ... DO STUFF ... release(%y) call void @use(%x) rdar://13750319 llvm-svn: 181743	2013-05-13 23:49:42 +00:00
Jack Carter	f5f48d8ff7	Mips assembler: Assembler macro ADDIU $rs,imm This patch adds alias for addiu instruction which enables following syntax: addiu $rs,imm The macro is translated as: addiu $rs,$rs,imm Contributer: Vladimir Medic llvm-svn: 181729	2013-05-13 20:26:46 +00:00
Bill Schmidt	22d40dcfe9	PPC64: Constant initializers with dynamic relocations go in .data.rel.ro. This fixes warning messages observed in the oggenc application test in projects/test-suite. Special handling is needed for the 64-bit PowerPC SVR4 ABI when a constant is initialized with a pointer to a function in a shared library. Because a function address is implemented as the address of a function descriptor, the use of copy relocations can lead to problems with initialization. GNU ld therefore replaces copy relocations with dynamic relocations to be resolved by the dynamic linker. This means the constant cannot reside in the read-only data section, but instead belongs in .data.rel.ro, which is designed for constants containing dynamic relocations. The implementation creates a class PPC64LinuxTargetObjectFile inheriting from TargetLoweringObjectFileELF, which behaves like its parent except to place constants of this sort into .data.rel.ro. The test case is reduced from the oggenc application. llvm-svn: 181723	2013-05-13 19:34:37 +00:00
Akira Hatanaka	9edae02db8	[mips] Add option -mno-ldc1-sdc1. This option is used when the user wants to avoid emitting double precision FP loads and stores. Double precision FP loads and stores are expanded to single precision instructions after register allocation. llvm-svn: 181718	2013-05-13 18:23:35 +00:00
Mihai Popa	dc1764c5a4	The purpose of the patch is to fix the syntax of ARM mrc and mrc2 instructions when they are used to write to the APSR. In this case, the destination operand should be APSR_nzcv, and the encoding of the target should be 0b1111 (same as for PC). In pre-UAL syntax, this form used the PC register as a textual target. This is still allowed for backward compatibility. llvm-svn: 181705	2013-05-13 14:10:04 +00:00
Lang Hames	67c09b3f88	Correctly preserve the input chain for potential tailcall nodes whose return values are bitcasts. The chain had previously been being clobbered with the entry node to the dag, which sometimes caused other code in the function to be erroneously deleted when tailcall optimization kicked in. <rdar://problem/13827621> llvm-svn: 181696	2013-05-13 10:21:19 +00:00
Hao Liu	bc60196951	Fix PR15950 A bug in DAG Combiner about undef mask llvm-svn: 181682	2013-05-13 02:07:05 +00:00
Rafael Espindola	65c016b106	XFAIL this test for mingw too. llvm-svn: 181678	2013-05-13 00:18:24 +00:00
Nadav Rotem	ce42cc6d4d	SLPVectorizer: Fix a bug in the code that generates extracts for values with multiple users. The external user does not have to be in lane #0. We have to save the lane for each scalar so that we know which vector lane to extract. llvm-svn: 181674	2013-05-12 22:58:45 +00:00
David Majnemer	6c30f49af3	InstCombine: Flip the order of two urem transforms There are two transforms in visitUrem that conflict with each other. ) One, if a divisor is a power of two, subtracts one from the divisor and turns it into a bitwise-and. ) The other unwraps both operands if they are surrounded by zext instructions. Flipping the order allows the subtraction to go beneath the sign extension. llvm-svn: 181668	2013-05-12 00:07:05 +00:00
Arnold Schwaighofer	f2305e4467	LoopVectorize: Use the widest induction variable type Use the widest induction type encountered for the cannonical induction variable. We used to turn the following loop into an empty loop because we used i8 as induction variable type and truncated 1024 to 0 as trip count. int a[1024]; void fail() { int reverse_induction = 1023; unsigned char forward_induction = 0; while ((reverse_induction) >= 0) { forward_induction++; a[reverse_induction] = forward_induction; --reverse_induction; } } radar://13862901 llvm-svn: 181667	2013-05-11 23:04:28 +00:00
David Majnemer	470b077bca	InstCombine: Turn urem to bitwise-and more often Use isKnownToBeAPowerOfTwo in visitUrem so that we may more aggressively fold away urem instructions. llvm-svn: 181661	2013-05-11 09:01:28 +00:00
Reed Kotler	739d36a265	Add -mtriple=mipsel-linux-gnu to the test so that the compiler does not think it can support small data sections. llvm-svn: 181654	2013-05-11 01:02:20 +00:00
Nadav Rotem	cdfb48d2fe	SLPVectorizer: Add support for trees with external users. For example: bar() { int a = A[i]; int b = A[i+1]; B[i] = a; B[i+1] = b; foo(a); <--- a is used outside the vectorized expression. } llvm-svn: 181648	2013-05-10 22:59:33 +00:00
Nadav Rotem	5ff6db1153	Add an additional testcase for PR15882. llvm-svn: 181646	2013-05-10 22:55:44 +00:00
Reed Kotler	783c79446b	Checkin in of first of several patches to finish implementation of mips16/mips32 floating point interoperability. This patch fixes returns from mips16 functions so that if the function was in fact called by a mips32 hard float routine, then values that would have been returned in floating point registers are so returned. Mips16 mode has no floating point instructions so there is no way to load values into floating point registers. This is needed when returning float, double, single complex, double complex in the Mips ABI. Helper functions in libc for mips16 are available to do this. For efficiency purposes, these helper functions have a different calling convention from normal Mips calls. Registers v0,v1,a0,a1 are used to pass parameters instead of a0,a1,a2,a3. This is because v0,v1,a0,a1 are the natural registers used to return floating point values in soft float. These values can then be moved to the appropriate floating point registers with no extra cost. The only register that is modified is ra in this call. The helper functions make sure that the return values are in the floating point registers that they would be in if soft float was not in effect (which it is for mips16, though the soft float is implemented using a mips32 library that uses hard float). llvm-svn: 181641	2013-05-10 22:25:39 +00:00
David Blaikie	f95485a69e	Give the test from r181632 a target triple. llvm-svn: 181637	2013-05-10 22:14:39 +00:00
David Blaikie	a1e813dcd4	PR14492: Debug Info: Support for values of non-integer non-type template parameters. This is only tested for global variables at the moment (& includes tests for the unnamed parameter case, since apparently this entire function was completely untested previously) llvm-svn: 181632	2013-05-10 21:52:07 +00:00
Jyotsna Verma	300f0b966c	Hexagon: Fix switch cases in HexagonVLIWPacketizer.cpp. llvm-svn: 181624	2013-05-10 20:27:34 +00:00
Chad Rosier	c8569cba93	[ms-inline asm] Fix a crasher when we fail on a direct match. The issue was that the MatchingInlineAsm and VariantID args to the MatchInstructionImpl function weren't being set properly. Specifically, when parsing intel syntax, the parser thought it was parsing inline assembly in the at&t dialect; that will never be the case. The crash was caused when the emitter tried to emit the instruction, but the operands weren't set. When parsing inline assembly we only set the opcode, not the operands, which is used to lookup the instruction descriptor. rdar://13854391 and PR15945 Also, this commit reverts r176036. Now that we're correctly parsing the intel syntax the pushad/popad don't match properly. I've reimplemented that fix using a MnemonicAlias. llvm-svn: 181620	2013-05-10 18:24:17 +00:00
Benjamin Kramer	14e915f7b4	InstCombine: Don't claim to be able to evaluate any shl in a zexted type. The shift amount may be larger than the type leading to undefined behavior. Limit the transform to constant shift amounts. While there update the bits to clear in the result which may enable additional optimizations. PR15959. llvm-svn: 181604	2013-05-10 16:26:37 +00:00
Logan Chien	4ea23b56c5	Implement AsmParser for ARM unwind directives. This commit implements the AsmParser for fnstart, fnend, cantunwind, personality, handlerdata, pad, setfp, save, and vsave directives. This commit fixes some minor issue in the ARMELFStreamer: * The switch back to corresponding section after the .fnend directive. * Emit the unwind opcode while processing .fnend directive if there is no .handlerdata directive. * Emit the unwind opcode to .ARM.extab while processing .handlerdata even if .personality directive does not exist. llvm-svn: 181603	2013-05-10 16:17:24 +00:00
Aaron Ballman	e42ccf32cc	XFAILing this test on Win32 to unbreak the build bots. llvm-svn: 181600	2013-05-10 14:42:16 +00:00
Benjamin Kramer	a5d59333b3	DAGCombiner: Generate a correct constant for vector types when folding (xor (and)) into (and (not)). PR15948. llvm-svn: 181597	2013-05-10 14:09:52 +00:00
Benjamin Kramer	a6645e8b8f	InstCombine: Verify the type before transforming uitofp into select. PR15952. llvm-svn: 181586	2013-05-10 09:16:52 +00:00
Tom Stellard	2b971eb0d0	R600: Remove AMDILPeeopholeOptimizer and replace optimizations with tablegen patterns The BFE optimization was the only one we were actually using, and it was emitting an intrinsic that we don't support. https://bugs.freedesktop.org/show_bug.cgi?id=64201 Reviewed-by: Christian König <christian.koenig@amd.com> NOTE: This is a candidate for the 3.3 branch. llvm-svn: 181580	2013-05-10 02:09:45 +00:00
Tom Stellard	3a7c34c778	R600: Expand SUB for v2i32/v4i32 Patch by: Aaron Watry Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Aaron Watry <awatry@gmail.com> NOTE: This is a candidate for the 3.3 branch. llvm-svn: 181579	2013-05-10 02:09:39 +00:00
Tom Stellard	3deddc5079	R600: Expand MUL for v4i32/v2i32 Fixes piglit test for OpenCL builtin mul24, and allows mad24 to run. Patch by: Aaron Watry Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Aaron Watry <awatry@gmail.com> NOTE: This is a candidate for the 3.3 branch. llvm-svn: 181578	2013-05-10 02:09:34 +00:00
Tom Stellard	7fb3963498	R600: Expand SRA for v4i32/v2i32 v2: Add v4i32 test Patch by: Aaron Watry Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Aaron Watry <awatry@gmail.com> NOTE: This is a candidate for the 3.3 branch. llvm-svn: 181577	2013-05-10 02:09:29 +00:00
Tom Stellard	a99c6ae47a	R600: Expand vselect for v4i32 and v2i32 v2: Add vselect v4i32 test Patch by: Aaron Watry Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Aaron Watry <awatry@gmail.com> NOTE: This is a candidate for the 3.3 branch. llvm-svn: 181576	2013-05-10 02:09:24 +00:00
Chad Rosier	edb1dc8498	[x86AsmParser] It's valid to stop parsing an operand at an immediate. rdar://13854369 and PR15944 llvm-svn: 181564	2013-05-09 23:48:53 +00:00
Owen Anderson	32baf99b1d	Teach SelectionDAG to constant fold all-constant FMA nodes the same way that it constant folds FADD, FMUL, etc. llvm-svn: 181555	2013-05-09 22:27:13 +00:00
Bill Wendling	07fe235e2b	Generate a compact unwind encoding in the face of a stack alignment push. We generate a `push' of a random register (%rax) if the stack needs to be aligned by the size of that register. However, this could mess up compact unwind generation. In particular, we want to still generate compact unwind in the presence of this monstrosity. Check if the push of of the %rax/%eax register. If it is and it's marked with the `FrameSetup' flag, then we can generate a compact unwind encoding for the function only if the push is the last FrameSetup instruction. llvm-svn: 181540	2013-05-09 20:10:38 +00:00
Jyotsna Verma	978e972ff9	Hexagon: Use relation map for getMatchingCondBranchOpcode() and getInvertedPredicatedOpcode() functions instead of switch cases. llvm-svn: 181530	2013-05-09 18:25:44 +00:00
Rafael Espindola	007521673b	Don't replace an alias in llvm.used with its target. When we replace an internal alias with its target, be careful not to replace the entry in llvm.used (and llvm.compiler_used). llvm-svn: 181524	2013-05-09 17:22:59 +00:00
Richard Osborne	1333fa3d68	[XCore] Fix handling of functions where only the LR is spilled. Previously we only checked if the LR required saving if the frame size was non zero. However because the caller reserves 1 word for the callee to use that doesn't count towards our frame size it is possible for the LR to need saving and for the frame size to be 0. We didn't hit when the LR needed saving because of a function calls because the 1 word of stack we must allocate for our callee means the frame size is always non zero in this case. However we can hit this case if the LR is clobbered in inline asm. llvm-svn: 181520	2013-05-09 16:43:42 +00:00
Benjamin Kramer	21b972ae94	InstCombine: Don't just copy known bits from the first operand of an srem. That's obviously wrong. Conservatively restrict it to the sign bit, which matches the original intention of this analysis. Fixes PR15940. llvm-svn: 181518	2013-05-09 16:32:32 +00:00
Rafael Espindola	0d15f7313f	Change getRelocationAdditionalInfo to be ELF only. It was only implemented for ELF where it collected the Addend, so this patch also renames it to getRelocationAddend. llvm-svn: 181502	2013-05-09 03:39:05 +00:00
Eric Christopher	f20ff979e9	Revert "Make sure debug info contains linkage names (DW_AT_MIPS_linkage_name)" temporarily while investigating gdb.cp/templates.exp. This reverts commit r181471. llvm-svn: 181496	2013-05-09 00:42:33 +00:00
Arnold Schwaighofer	2e8c69cf97	LoopVectorizer: Don't assert on the absence of induction variables A computable loop exit count does not imply the presence of an induction variable. Scalar evolution can return a value for an infinite loop. Fixes PR15926. llvm-svn: 181495	2013-05-09 00:32:18 +00:00
Daniel Malea	abdd7f197a	Revert 181475 as the DebugIR tests are breaking (automake) buildbots that re-use build dirs - the temporaries "-debug.ll" files generated by DebugIR pass are considered tests, even though they are not llvm-svn: 181476	2013-05-08 21:55:31 +00:00
Eric Christopher	697fa1c8be	Make sure debug info contains linkage names (DW_AT_MIPS_linkage_name) for constructors and destructors since the original declaration given by the AT_specification both won't and can't. Patch by Yacine Belkadi, I've cleaned up the testcases. llvm-svn: 181471	2013-05-08 21:23:22 +00:00
Daniel Malea	0233db70f0	DebugIR tests -- lit tests for the line number transform - simple one-function case - function-calling case - external function calling case - exception throwing case - vector case Note: these tests are somewhat coupled to the current format of debug metadata. llvm-svn: 181469	2013-05-08 21:03:00 +00:00
Akira Hatanaka	b4526ea132	[mips] Add instruction selection pattern for (seteq $LHS, 0). llvm-svn: 181459	2013-05-08 19:38:04 +00:00
Ulrich Weigand	689e34a824	[PowerPC] Add ELF relocation tests This patch extends test/MC/PowerPC/ppc64-fixups.s to not only check for the correct fixup type in the --show-encoding output, but also runs the generated object file through llvm-readobj -r and verifies that the correct ELF relocation records were generated. llvm-svn: 181453	2013-05-08 17:51:44 +00:00
Bill Schmidt	38b6cb51bc	Fix handling of anonymous aggregate parameters for powerpc*-apple-darwin8. This fixes bug 15821 similarly to the powerpc64-linux fix for bug 14779. Patch by David Fang. llvm-svn: 181449	2013-05-08 17:22:33 +00:00
Michel Danzer	7bbd7aa7fe	R600/SI: Add lit tests for llvm.SI.imageload and llvm.SI.resinfo intrinsics Adapted from the llvm.SI.sample test. Reviewed-by: Christian König <christian.koenig@amd.com> llvm-svn: 181425	2013-05-08 13:07:29 +00:00
Hal Finkel	08e53ee551	PPCInstrInfo::optimizeCompareInstr should not optimize FP compares The floating-point record forms on PPC don't set the condition register bits based on a comparison with zero (like the integer record forms do), but rather based on the exception status bits. llvm-svn: 181423	2013-05-08 12:16:14 +00:00
Mihai Popa	1fb61c6eed	This patch fixes two tests marked as XFAIL among the ARM assembler tests. The reference encoding is correct, but written in the wrong byte order (these are Thumb tests, while the reference is in ARM byte order). llvm-svn: 181420	2013-05-08 09:41:12 +00:00
Nick Lewycky	5fb1963f2a	Fix a bug in codegenprep where it was losing track of values OptimizeMemoryInst by switching to a ValueMap. Patch by Andrea DiBiagio! llvm-svn: 181397	2013-05-08 09:00:10 +00:00
David Majnemer	386ab7f872	DAGCombiner: Simplify inverted bit tests Fold (xor (and x, y), y) -> (and (not x), y) This removes an opportunity for a constant to appear twice. llvm-svn: 181395	2013-05-08 06:44:42 +00:00
David Blaikie	3b6038b6f3	Debug Info: Support DW_TAG_imported_declaration This provides basic functionality for imported declarations. For subprograms and types some amount of lazy construction is supported (so the definition of a function can proceed the using declaration), but it still doesn't handle declared-but-not-defined functions (since we don't generally emit function declarations). Variable support is really rudimentary at the moment - simply looking up the existing definition with no support for out of order (declaration, imported_module, then definition). llvm-svn: 181392	2013-05-08 06:01:41 +00:00
Arnold Schwaighofer	3610139ac5	LoopVectorizer: Improve reduction variable identification The two nested loops were confusing and also conservative in identifying reduction variables. This patch replaces them by a worklist based approach. llvm-svn: 181369	2013-05-07 21:55:37 +00:00
Kevin Enderby	ca08df756e	Fix a bug in the MC asm parser evaluating expressions. It was treating: A = 9 B = 3 * A - 2 * A + 1 as B = 3 * A - (2 * A + 1) rdar://13816516 llvm-svn: 181366	2013-05-07 21:40:58 +00:00
Jyotsna Verma	5eb598001c	Hexagon: Fix Small Data support to handle -G 0 correctly. llvm-svn: 181344	2013-05-07 19:53:00 +00:00
David Blaikie	6baa776173	Debug Info: Fix for break due to r181271 Apparently we didn't keep an association of Compile Unit metadata nodes to DIEs so looking up that parental context failed & thus caused no DW_TAG_imported_modules to be emitted at the CU scope. Fix this by adding the mapping & sure up the test case to verify this. llvm-svn: 181339	2013-05-07 17:57:13 +00:00
Jyotsna Verma	03c6ca905c	Reverting r181331. Missing file, HexagonSplitConst32AndConst64.cpp, from lib/Target/Hexagon/CMakeLists.txt. llvm-svn: 181334	2013-05-07 17:12:35 +00:00
Jyotsna Verma	19f0b40dcf	Hexagon: Fix Small Data support to handle -G 0 correctly. llvm-svn: 181331	2013-05-07 16:42:15 +00:00
Arnold Schwaighofer	e78b76fbed	LoopVectorize: getConsecutiveVector must respect signed arithmetic We were passing an i32 to ConstantInt::get where an i64 was needed and we must also pass the sign if we pass negatives numbers. The start index passed to getConsecutiveVector must also be signed. Should fix PR15882. llvm-svn: 181286	2013-05-07 04:37:05 +00:00
David Blaikie	684fc5331e	DebugInfo: Support imported modules in lexical blocks llvm-svn: 181271	2013-05-06 23:33:07 +00:00
David Majnemer	70f286d95f	InstCombine: (X ^ signbit) + C -> X + (signbit ^ C) llvm-svn: 181249	2013-05-06 21:21:31 +00:00
Bill Wendling	66bdab1322	Reduce attributes. llvm-svn: 181245	2013-05-06 20:57:23 +00:00
Rafael Espindola	fa4513a178	Split Alignment out of the Section Characteristics. The alignment is just a byte in the middle of Characteristics, not an independent flag. Making it an independent field in the yaml representation makes it more yamlio friendly. llvm-svn: 181243	2013-05-06 20:11:21 +00:00
Jean-Luc Duprat	1610d571db	Test results verified using FileCheck rather than grep \| count llvm-svn: 181234	2013-05-06 18:45:16 +00:00
Andrew Trick	9c72b071fe	Rotate multi-exit loops even if the latch was simplified. Test case by Michele Scandale! Fixes PR10293: Load not hoisted out of loop with multiple exits. There are few regressions with this patch, now tracked by rdar:13817079, and a roughly equal number of improvements. The regressions are almost certainly back luck because LoopRotate has very little idea of whether rotation is profitable. Doing better requires a more comprehensive solution. This checkin is a quick fix that lacks generality (PR10293 has a counter-example). But it trivially fixes the case in PR10293 without interfering with other cases, and it does satify the criteria that LoopRotate is a loop canonicalization pass that should avoid heuristics and special cases. I can think of two approaches that would probably be better in the long run. Ultimately they may both make sense. (1) LoopRotate should check that the current header would make a good loop guard, and that the loop does not already has a sufficient guard. The artifical SimplifiedLoopLatch check would be unnecessary, and the design would be more general and canonical. Two difficulties: - We need a strong guarantee that we won't endlessly rotate, so the analysis would need to be precise in order to avoid the SimplifiedLoopLatch precondition. - Analysis like this are usually based on SCEV, which we don't want to rely on. (2) Rotate on-demand in late loop passes. This could even be done by shoving the loop back on the queue after the optimization that needs it. This could work well when we find LICM opportunities in multi-branch loops. This requires some work, and it doesn't really solve the problem of SCEV wanting a loop guard before the analysis. llvm-svn: 181230	2013-05-06 17:58:18 +00:00
Tom Stellard	043de4c5af	R600: Emit config values in register / value pairs Reviewed-by: Vincent Lejeune <vljn@ovi.com> Tested-By: Aaron Watry <awatry@gmail.com> llvm-svn: 181228	2013-05-06 17:50:51 +00:00
Tom Stellard	cfe2ef8fea	R600: Stop emitting the instruction type byte before each instruction Reviewed-by: Vincent Lejeune <vljn@ovi.com> Tested-By: Aaron Watry <awatry@gmail.com> llvm-svn: 181225	2013-05-06 17:50:44 +00:00
Tom Stellard	dbbcaf31b6	R600: Emit ISA for CALL_FS_* instructions Reviewed-by: Vincent Lejeune <vljn@ovi.com> Tested-By: Aaron Watry <awatry@gmail.com> llvm-svn: 181223	2013-05-06 17:50:26 +00:00
Ulrich Weigand	e7c6dfeb4b	[SystemZ] Update non-pic DWARF encodings As pointed out by Rafael Espindola, we should match the DWARF encodings produced by GCC in both pic and non-pic modes. This was not the case for the non-pic case. This patch changes all DWARF encodings to DW_EH_PE_absptr for the non-pic case, just like GCC does. The test case is updated to check for both variants. llvm-svn: 181222	2013-05-06 17:28:30 +00:00
Adhemerval Zanella	e8bd03da5c	PowerPC: Fix unimplemented relocation on ppc64 This patch handles the R_PPC64_REL64 relocation type for powerpc64 for mcjit. llvm-svn: 181220	2013-05-06 17:21:23 +00:00
Jean-Luc Duprat	4189ef456a	Fix add4.ll test cmdline so that it passes llvm-svn: 181219	2013-05-06 17:18:47 +00:00
Jean-Luc Duprat	3e4fc3ef24	Provide InstCombines for the following 3 cases: A * (1 - (uitofp i1 C)) -> select C, 0, A B * (uitofp i1 C) -> select C, B, 0 select C, 0, A + select C, B, 0 -> select C, B, A These come up in code that has been hand-optimized from a select to a linear blend, on platforms where that may have mattered. We want to undo such changes with the following transform: A(1 - uitofp i1 C) + B(uitofp i1 C) -> select C, A, B llvm-svn: 181216	2013-05-06 16:55:50 +00:00
Tim Northover	7dbbc28f72	AArch64: use MCJIT by default and enable related tests. This just enables some testing I'd missed after implementing MCJIT support. llvm-svn: 181215	2013-05-06 16:51:08 +00:00
Ulrich Weigand	80435baa14	[SystemZ] Set up JIT/MCJIT test cases This patch adds the necessary configuration bits and #ifdef's to set up the JIT/MCJIT test cases for SystemZ. Like other recent targets, we do fully support MCJIT, but do not support the old JIT at all. Set up the lit config files accordingly, and disable old-JIT unit tests. Patch by Richard Sandiford. llvm-svn: 181207	2013-05-06 16:21:50 +00:00
Ulrich Weigand	982a21510e	[SystemZ] Add MC test cases This adds all MC tests for the SystemZ target. Patch by Richard Sandiford. llvm-svn: 181206	2013-05-06 16:20:58 +00:00
Ulrich Weigand	08bd6154d9	[SystemZ] Add DebugInfo test cases This adds all DebugInfo tests for the SystemZ target. This version of the patch incorporates feedback from reviews by Eric Christopher and Rafael Espindola. Thanks to all reviewers! Patch by Richard Sandiford. llvm-svn: 181205	2013-05-06 16:18:29 +00:00
Ulrich Weigand	9e3577ff44	[SystemZ] Add CodeGen test cases This adds all CodeGen tests for the SystemZ target. This version of the patch incorporates feedback from a review by Sean Silva. Thanks to all reviewers! Patch by Richard Sandiford. llvm-svn: 181204	2013-05-06 16:17:29 +00:00
Rafael Espindola	57dc142d40	Free the exception object. Should fix the vg bots. llvm-svn: 181195	2013-05-06 13:30:52 +00:00
Michael Kuperstein	ac868757d0	Fix slightly too aggressive conact_vector optimization. (Would sometimes optimize away conacts used to extend a vector with undef values) llvm-svn: 181186	2013-05-06 08:06:13 +00:00
Bill Wendling	b07a68ebb0	Add a testcase that checks that we generate functions with frame pointers or not depending upon the function attributes. llvm-svn: 181180	2013-05-06 05:45:57 +00:00
Rafael Espindola	072f4d9a1e	XFAIL for cygwin. Looks like symbol resolution is not working on cygwin, the test fails because __gxx_personality_v0 is not found. llvm-svn: 181179	2013-05-06 03:35:56 +00:00
Nadav Rotem	c70ef4e93c	Revert r164763 because it introduces new shuffles. Thanks Nick Lewycky for pointing this out. llvm-svn: 181177	2013-05-06 02:39:09 +00:00
Matt Arsenault	c23753a53e	Fix unchecked uses of DominatorTree in MemoryDependenceAnalysis. Use unknown results for places where it would be needed llvm-svn: 181176	2013-05-06 02:07:24 +00:00

... 5 6 7 8 9 ...

19834 Commits