llvm-project

Commit Graph

Author	SHA1	Message	Date
Duncan P. N. Exon Smith	be7ea19b58	IR: Make metadata typeless in assembly Now that `Metadata` is typeless, reflect that in the assembly. These are the matching assembly changes for the metadata/value split in r223802. - Only use the `metadata` type when referencing metadata from a call intrinsic -- i.e., only when it's used as a `Value`. - Stop pretending that `ValueAsMetadata` is wrapped in an `MDNode` when referencing it from call intrinsics. So, assembly like this: define @foo(i32 %v) { call void @llvm.foo(metadata !{i32 %v}, metadata !0) call void @llvm.foo(metadata !{i32 7}, metadata !0) call void @llvm.foo(metadata !1, metadata !0) call void @llvm.foo(metadata !3, metadata !0) call void @llvm.foo(metadata !{metadata !3}, metadata !0) ret void, !bar !2 } !0 = metadata !{metadata !2} !1 = metadata !{i32* @global} !2 = metadata !{metadata !3} !3 = metadata !{} turns into this: define @foo(i32 %v) { call void @llvm.foo(metadata i32 %v, metadata !0) call void @llvm.foo(metadata i32 7, metadata !0) call void @llvm.foo(metadata i32* @global, metadata !0) call void @llvm.foo(metadata !3, metadata !0) call void @llvm.foo(metadata !{!3}, metadata !0) ret void, !bar !2 } !0 = !{!2} !1 = !{i32* @global} !2 = !{!3} !3 = !{} I wrote an upgrade script that handled almost all of the tests in llvm and many of the tests in cfe (even handling many `CHECK` lines). I've attached it (or will attach it in a moment if you're speedy) to PR21532 to help everyone update their out-of-tree testcases. This is part of PR21532. llvm-svn: 224257	2014-12-15 19:07:53 +00:00
Chad Rosier	620fb2206d	[ARMConstantIsland] Insert tbb/tbh optimization where previous jump table resided. llvm-svn: 224165	2014-12-12 23:27:40 +00:00
Tim Northover	631cc9ce1a	ARM: allow constpool entry to be moved to the user's block in all cases. Normally entries can only move to a lower address, but when that wasn't viable, the user's block was considered anyway. Unfortunately, it went via createNewWater which wasn't designed to handle the case where there's already an island after the block. Unfortunately, the test we have is slow and fragile, and I couldn't reduce it to anything sane even with the @llvm.arm.space intrinsic. The test change here is recreating the previous one after the change. rdar://problem/18545506 llvm-svn: 221905	2014-11-13 17:58:53 +00:00
Akira Hatanaka	0d0c78180d	ARM: Fix a bug which was causing convergence failure in constant-island pass. The bug is in ARMConstantIslands::createNewWater where the upper bound of the new water split point is computed: // This could point off the end of the block if we've already got constant // pool entries following this block; only the last one is in the water list. // Back past any possible branches (allow for a conditional and a maximally // long unconditional). if (BaseInsertOffset + 8 >= UserBBI.postOffset()) { BaseInsertOffset = UserBBI.postOffset() - UPad - 8; DEBUG(dbgs() << format("Move inside block: %#x\n", BaseInsertOffset)); } The split point is supposed to be somewhere between the machine instruction that loads from the constant pool entry and the end of the basic block, before branch instructions. The code above is fine if the basic block is large enough and there are a sufficient number of instructions following the machine instruction. However, if the machine instruction is near the end of the basic block, BaseInsertOffset can point to the machine instruction or another instruction that precedes it, and this can lead to convergence failure. This commit fixes this bug by ensuring BaseInsertOffset is larger than the offset of the instruction following the constant-loading instruction. rdar://problem/18581150 llvm-svn: 220015	2014-10-17 01:31:47 +00:00
Rafael Espindola	11aaaeebe0	Delete -std-compile-opts. These days -std-compile-opts was just a silly alias for -O3. llvm-svn: 219951	2014-10-16 20:00:02 +00:00
Oliver Stannard	d4e0a4fd2c	[ARM] Allow selecting VRINT[APMXZR] and VCVT[BT] instructions for FPv5 Currently, we only codegen the VRINT[APMXZR] and VCVT[BT] instructions when targeting ARMv8, but they are actually present on any target with FP-ARMv8. Note that FP-ARMv8 is called FPv5 when is is part of an M-profile core, but they have the same instructions so we model them both as FPARMv8 in the ARM backend. llvm-svn: 218763	2014-10-01 13:13:18 +00:00
Oliver Stannard	37e4daab05	[ARM] Add support for Cortex-M7, FPv5-SP and FPv5-DP (LLVM) The Cortex-M7 has 3 options for its FPU: none, FPv5-SP-D16 and FPv5-DP-D16. FPv5 has the same instructions as FP-ARMv8, so it can be modelled using the same target feature, and all double-precision operations are already disabled by the fp-only-sp target features. llvm-svn: 218747	2014-10-01 09:02:17 +00:00
Reid Kleckner	2d9bb65b3d	ARM / x86_64 varargs: Don't save regparms in prologue without va_start There's no need to do this if the user doesn't call va_start. In the future, we're going to have thunks that forward these register parameters with musttail calls, and they won't need these spills for handling va_start. Most of the test suite changes are adding va_start calls to existing tests to keep things working. llvm-svn: 216294	2014-08-22 21:59:26 +00:00
Oliver Stannard	51b1d460cb	[ARM] Enable DP copy, load and store instructions for FPv4-SP The FPv4-SP floating-point unit is generally referred to as single-precision only, but it does have double-precision registers and load, store and GPR<->DPR move instructions which operate on them. This patch enables the use of these registers, the main advantage of which is that we now comply with the AAPCS-VFP calling convention. This partially reverts r209650, which added some AAPCS-VFP support, but did not handle return values or alignment of double arguments in registers. This patch also adds tests for Thumb2 code generation for floating-point instructions and intrinsics, which previously only existed for ARM. llvm-svn: 216172	2014-08-21 12:50:31 +00:00
Tim Northover	2a417b96d4	ARM: do not generate BLX instructions on Cortex-M CPUs. Particularly on MachO, we were generating "blx _dest" instructions on M-class CPUs, which don't actually exist. They happen to get fixed up by the linker into valid "bl _dest" instructions (which is why such a massive issue has remained largely undetected), but we shouldn't rely on that. llvm-svn: 214959	2014-08-06 11:13:14 +00:00
Akira Hatanaka	dc08c30df9	[ARM] In dynamic-no-pic mode, ARM's post-RA pseudo expansion was incorrectly expanding pseudo LOAD_STATCK_GUARD using instructions that are normally used in pic mode. This patch fixes the bug. <rdar://problem/17886592> llvm-svn: 214614	2014-08-02 05:40:40 +00:00
Akira Hatanaka	ba3af24c25	[stack protector] Add test cases for thumb and thumb2. <rdar://problem/12475629> llvm-svn: 213970	2014-07-25 19:47:46 +00:00
Tim Northover	14ff2df05c	ARM: spot SBFX-compatbile code expressed with sign_extend_inreg We were assuming all SBFX-like operations would have the shl/asr form, but often when the field being extracted is an i8 or i16, we end up with a SIGN_EXTEND_INREG acting on a shift instead. Simple enough to check for though. llvm-svn: 213754	2014-07-23 13:59:12 +00:00
Tim Northover	7ad2a0e0c2	ARM: add patterns for [su]xta[bh] from just a shift. Although the final shifter operand is a rotate, this actually only matters for the half-word extends when the amount == 24. Otherwise folding a shift in is just as good. llvm-svn: 213753	2014-07-23 13:59:07 +00:00
Chandler Carruth	9a0051cd59	[SDAG] Make the DAGCombine worklist not grow endlessly due to duplicate insertions. The old behavior could cause arbitrarily bad memory usage in the DAG combiner if there was heavy traffic of adding nodes already on the worklist to it. This commit switches the DAG combine worklist to work the same way as the instcombine worklist where we null-out removed entries and only add new entries to the worklist. My measurements of codegen time shows slight improvement. The memory utilization is unsurprisingly dominated by other factors (the IR and DAG itself I suspect). This change results in subtle, frustrating churn in the particular order in which DAG combines are applied which causes a number of minor regressions where we fail to match a pattern previously matched by accident. AFAICT, all of these should be using AddToWorklist to directly or should be written in a less brittle way. None of the changes seem drastically bad, and a few of the changes seem distinctly better. A major change required to make this work is to significantly harden the way in which the DAG combiner handle nodes which become dead (zero-uses). Previously, we relied on the ability to "priority-bump" them on the combine worklist to achieve recursive deletion of these nodes and ensure that the frontier of remaining live nodes all were added to the worklist. Instead, I've introduced a routine to just implement that precise logic with no indirection. It is a significantly simpler operation than that of the combiner worklist proper. I suspect this will also fix some other problems with the combiner. I think the x86 changes are really minor and uninteresting, but the avx512 change at least is hiding a "regression" (despite the test case being just noise, not testing some performance invariant) that might be looked into. Not sure if any of the others impact specific "important" code paths, but they didn't look terribly interesting to me, or the changes were really minor. The consensus in review is to fix any regressions that show up after the fact here. Thanks to the other reviewers for checking the output on other architectures. There is a specific regression on ARM that Tim already has a fix prepped to commit. Differential Revision: http://reviews.llvm.org/D4616 llvm-svn: 213727	2014-07-23 07:08:53 +00:00
Christian Pirker	c6308f59b2	ARM: Fix TPsoft for Thumb mode Reviewed at http://reviews.llvm.org/D4230 llvm-svn: 211601	2014-06-24 15:45:59 +00:00
Alp Toker	d3d017cf00	Reduce verbiage of lit.local.cfg files We can just split targets_to_build in one place and make it immutable. llvm-svn: 210496	2014-06-09 22:42:55 +00:00
Tim Northover	b4ddc0845a	ARM & AArch64: make use of common cmpxchg idioms after expansion The C and C++ semantics for compare_exchange require it to return a bool indicating success. This gets mapped to LLVM IR which follows each cmpxchg with an icmp of the value loaded against the desired value. When lowered to ldxr/stxr loops, this extra comparison is redundant: its results are implicit in the control-flow of the function. This commit makes two changes: it replaces that icmp with appropriate PHI nodes, and then makes sure earlyCSE is called after expansion to actually make use of the opportunities revealed. I've also added -{arm,aarch64}-enable-atomic-tidy options, so that existing fragile tests aren't perturbed too much by the change. Many of them either rely on undef/unreachable too pervasively to be restored to something well-defined (particularly while making sure they test the same obscure assert from many years ago), or depend on a particular CFG shape, which is disrupted by SimplifyCFG. rdar://problem/16227836 llvm-svn: 209883	2014-05-30 10:09:59 +00:00
James Molloy	556763d2ef	Fix the Load/Store optimization pass to work with Thumb1. Patch by Moritz Roth! llvm-svn: 208992	2014-05-16 14:14:30 +00:00
Reid Kleckner	9c6582129a	Move the segmented stack switch to a function attribute This removes the -segmented-stacks command line flag in favor of a per-function "split-stack" attribute. Patch by Luqman Aden and Alex Crichton! llvm-svn: 205997	2014-04-10 22:58:43 +00:00
Saleem Abdulrasool	c351ed2966	ARM: fix test case missed in previous roundup This should hopefully bring the last MSVC buildbot back to green! llvm-svn: 205596	2014-04-04 01:19:56 +00:00
Saleem Abdulrasool	905b6d192c	ARM: yet another round of ARM test clean ups llvm-svn: 205586	2014-04-03 23:47:24 +00:00
Saleem Abdulrasool	717c991923	ARM: update even more tests More updating of tests to be explicit about the target triple rather than relying on the default target triple supporting ARM mode. Indicate to lit that object emission is not yet available for Windows on ARM. llvm-svn: 205545	2014-04-03 17:35:22 +00:00
Oliver Stannard	b14c625111	ARM: Add support for segmented stacks Patch by Alex Crichton, ILyoan, Luqman Aden and Svetoslav. llvm-svn: 205430	2014-04-02 16:10:33 +00:00
Artyom Skrobov	1a6cd1d912	ARMv8 IfConversion must skip narrow instructions that a) define CPSR and b) wouldn't affect CPSR in an IT block llvm-svn: 202257	2014-02-26 11:27:28 +00:00
Nico Rieck	5ba5226ab9	Add extra CHECK prefix to tests with explicit prefix These tests mistakenly assume that CHECK is still available even if an explicit prefix is specified. llvm-svn: 201492	2014-02-16 13:28:15 +00:00
Renato Golin	78a6eba862	Remove -arm-disable-ehabi option llvm-svn: 200988	2014-02-07 20:12:49 +00:00
Manman Ren	b681918ddd	PGO branch weight: update edge weights in IfConverter. This commit only handles IfConvertTriangle. To update edge weights of a successor, one interface is added to MachineBasicBlock: /// Set successor weight of a given iterator. setSuccWeight(succ_iterator I, uint32_t weight) An existing testing case test/CodeGen/Thumb2/v8_IT_5.ll is updated, since we now correctly update the edge weights, the cold block is placed at the end of the function and we jump to the cold block. llvm-svn: 200428	2014-01-29 23:18:47 +00:00
Renato Golin	8cea6e8fc6	Enable EHABI by default After all hard work to implement the EHABI and with the test-suite passing, it's time to turn it on by default and allow users to disable it as a work-around while we fix the eventual bugs that show up. This commit also remove the -arm-enable-ehabi-descriptors, since we want the tables to be printed every time the EHABI is turned on for non-Darwin ARM targets. Although MCJIT EHABI is not working yet (needs linking with the right libraries), this commit also fixes some relocations on MCJIT regarding the EH tables/lib calls, and update some tests to avoid using EH tables when none are needed. The EH tests in the test-suite that were previously disabled on ARM now pass with these changes, so a follow-up commit on the test-suite will re-enable them. llvm-svn: 200388	2014-01-29 11:50:56 +00:00
Weiming Zhao	5930ae6cc2	[Thumbv8] Fix the value of BLXOperandIndex of isV8EligibleForIT Originally, BLX was passed as operand #0 in MachineInstr and as operand #2 in MCInst. But now, it's operand #2 in both cases. This patch also removes unnecessary FileCheck in the test case added by r199127. llvm-svn: 199928	2014-01-23 19:55:33 +00:00
Weiming Zhao	f66be56bf7	Fix PR 18369: [Thumbv8] asserts due to inconsistent CPSR liveness of IT blocks The issue is caused when Post-RA scheduler reorders a bundle instruction (IT block). However, it only flips the CPSR liveness of the bundle instruction, leaves the instructions inside the bundle unchanged, which causes inconstancy and crashes Thumb2SizeReduction.cpp::ReduceMBB(). llvm-svn: 199127	2014-01-13 18:47:54 +00:00
Benjamin Kramer	c10563d14e	Fix broken CHECK lines. llvm-svn: 199016	2014-01-11 21:06:00 +00:00
Joerg Sonnenberger	002a14765e	Enabling thumb2 mode used to force support for armv6t2. Replace this with a temporary assertion and adjust the various test cases. llvm-svn: 197224	2013-12-13 11:16:00 +00:00
David Peixotto	8ad70b3542	Add support for parsing ARM symbol variants on ELF targets ARM symbol variants are written with parens instead of @ like this: .word __GLOBAL_I_a(target1) This commit adds support for parsing these symbol variants in expressions. We introduce a new flag to MCAsmInfo that indicates the parser should use parens to parse the symbol variant. The expression parser is modified to look for symbol variants using parens instead of @ when the corresponding MCAsmInfo flag is true. The MCAsmInfo parens flag is enabled only for ARM on ELF. By adding this flag to MCAsmInfo, we are able to get rid of redundant ARM-specific symbol variants and use the generic variants instead (e.g. VK_GOT instead of VK_ARM_GOT). We use the new UseParensForSymbolVariant attribute in MCAsmInfo to correctly print the symbol variants for arm. To achive this we need to keep a handle to the MCAsmInfo in the MCSymbolRefExpr class that we can check when printing the symbol variant. Updated Tests: Changed case of symbol variant to match the generic kind. test/CodeGen/ARM/tls-models.ll test/CodeGen/ARM/tls1.ll test/CodeGen/ARM/tls2.ll test/CodeGen/Thumb2/tls1.ll test/CodeGen/Thumb2/tls2.ll PR18080 llvm-svn: 196424	2013-12-04 22:43:20 +00:00
Weiming Zhao	0da5cc0765	Enable generating legacy IT block for AArch32 By default, the behavior of IT block generation will be determinated dynamically base on the arch (armv8 vs armv7). This patch adds backend options: -arm-restrict-it and -arm-no-restrict-it. The former one restricts the generation of IT blocks (the same behavior as thumbv8) for both arches. The later one allows the generation of legacy IT block (the same behavior as ARMv7 Thumb2) for both arches. Clang will support -mrestrict-it and -mno-restrict-it, which is compatible with GCC. llvm-svn: 194592	2013-11-13 18:29:49 +00:00
Will Dietz	5cb7f4e3f2	MachineSink: Fix and tweak critical-edge breaking heuristic. Per original comment, the intention of this loop is to go ahead and break the critical edge (in order to sink this instruction) if there's reason to believe doing so might "unblock" the sinking of additional instructions that define registers used by this one. The idea is that if we have a few instructions to sink "together" breaking the edge might be worthwhile. This commit makes a few small changes to help better realize this goal: First, modify the loop to ignore registers defined by this instruction. We don't sink definitions of physical registers, and sinking an SSA definition isn't going to unblock an upstream instruction. Second, ignore uses of physical registers. Instructions that define physical registers are rejected for sinking, and so moving this one won't enable moving any defining instructions. As an added bonus, while virtual register use-def chains are generally small due to SSA goodness, iteration over the uses and definitions (used by hasOneNonDBGUse) for physical registers like EFLAGS can be rather expensive in practice. (This is the original reason for looking at this) Finally, to keep things simple continue to only consider this trick for registers that have a single use (via hasOneNonDBGUse), but to avoid spuriously breaking critical edges only do so if the definition resides in the same MBB and therefore this one directly blocks it from being sunk as well. If sinking them together is meant to be, let the iterative nature of this pass sink the definition into this block first. Update tests to accomodate this change, add new testcase where sinking avoids pipeline stalls. llvm-svn: 192608	2013-10-14 16:57:17 +00:00
Elena Demikhovsky	82a46ebe0a	Fixed a bug in dynamic allocation memory on stack. The alignment of allocated space was wrong, see Bugzila 17345. Done by Zvi Rackover <zvi.rackover@intel.com>. llvm-svn: 192573	2013-10-14 07:26:51 +00:00
Robert Wilhelm	f0cfb83bb4	Fix spelling intruction -> instruction. llvm-svn: 191610	2013-09-28 11:46:15 +00:00
Joey Gouly	a5153cb025	[ARMv8] Prevent generation of deprecated IT blocks on ARMv8 in Thumb mode. IT blocks can only be one instruction lonf, and can only contain a subset of the 16 instructions. Patch by Artyom Skrobov! llvm-svn: 190309	2013-09-09 14:21:49 +00:00
Tim Northover	1f1b2756a4	ARM: make sure ARM-mode pseudo-inst requires IsARM I'd forgotten that "Requires" blocks override rather than add to the constraints, so my pseudo-instruction was being selected in Thumb mode leading to nonsense instructions. rdar://problem/14817358 llvm-svn: 189096	2013-08-23 10:16:39 +00:00
Tim Northover	421804420d	ARM: use TableGen patterns to select CMOV operations. Back in the mists of time (2008), it seems TableGen couldn't handle the patterns necessary to match ARM's CMOV node that we convert select operations to, so we wrote a lot of fairly hairy C++ to do it for us. TableGen can deal with it now: there were a few minor differences to CodeGen (see tests), but nothing obviously worse that I could see, so we should probably address anything that does come up in a localised manner. llvm-svn: 188995	2013-08-22 09:57:11 +00:00
Jim Grosbach	6a7a727174	ARM: R9 is not safe to use for tcGPR. Indirect tail-calls shouldn't use R9 for the branch destination, as it's not reliably a call-clobbered register. rdar://14793425 llvm-svn: 188967	2013-08-22 00:14:24 +00:00
Daniel Dunbar	9efbedfd35	[tests] Cleanup initialization of test suffixes. - Instead of setting the suffixes in a bunch of places, just set one master list in the top-level config. We now only modify the suffix list in a few suites that have one particular unique suffix (.ml, .mc, .yaml, .td, .py). - Aside from removing the need for a bunch of lit.local.cfg files, this enables 4 tests that were inadvertently being skipped (one in Transforms/BranchFolding, a .s file each in DebugInfo/AArch64 and CodeGen/PowerPC, and one in CodeGen/SI which is now failing and has been XFAILED). - This commit also fixes a bunch of config files to use config.root instead of older copy-pasted code. llvm-svn: 188513	2013-08-16 00:37:11 +00:00
Lang Hames	24864fe150	Refactor AnalyzeBranch on ARM. The previous version did not always analyze indirect branches correctly. Under some circumstances, this led to the deletion of basic blocks that were the destination of indirect branches. In that case it left indirect branches to nowhere in the code. This patch replaces, and is more general than either of the previous fixes for indirect-branch-analysis issues, r181161 and r186461. For other branches (not indirect) this refactor should have almost identical behavior to the previous version. There are some corner cases where this refactor is able to analyze blocks that the previous version could not (e.g. this necessitated the update to thumb2-ifcvt2.ll). <rdar://problem/14464830> llvm-svn: 186735	2013-07-19 23:52:47 +00:00
Stephen Lin	6f36b45076	Update to more CodeGen tests to use CHECK-LABEL for labels corresponding to function definitions for more informative error messages. No functionality change. All changes were made by the following bash script: find test/CodeGen -name ".ll" \| \ while read NAME; do echo "$NAME" grep -q "^; RUN: llc.debug" $NAME && continue grep -q "^; RUN:.llvm-objdump" $NAME && continue grep -q "^; RUN: opt." $NAME && continue TEMP=`mktemp -t temp` cp $NAME $TEMP sed -n "s/^define [^@]@$[A-Za-z0-9_]$(.$/\1/p" < $NAME \| \ while read FUNC; do sed -i '' "s/;$[A-Za-z0-9_-]$$[A-Za-z0-9_-]$:$ $$FUNC[:] \$/;\1\2-LABEL:\3$FUNC:/g" $TEMP done sed -i '' "s/;$.$-LABEL-LABEL:/;\1-LABEL:/" $TEMP sed -i '' "s/;$.$-NEXT-LABEL:/;\1-NEXT:/" $TEMP sed -i '' "s/;$.$-NOT-LABEL:/;\1-NOT:/" $TEMP sed -i '' "s/;$.*$-DAG-LABEL:/;\1-DAG:/" $TEMP mv $TEMP $NAME done This script catches a superset of the cases caught by the script associated with commit r186280. It initially found some false positives due to unusual constructs in a minority of tests; all such cases were disambiguated first in commit r186621. llvm-svn: 186624	2013-07-18 22:47:09 +00:00
Stephen Lin	d24ab20e9b	Mass update to CodeGen tests to use CHECK-LABEL for labels corresponding to function definitions for more informative error messages. No functionality change and all updated tests passed locally. This update was done with the following bash script: find test/CodeGen -name ".ll" \| \ while read NAME; do echo "$NAME" if ! grep -q "^; RUN: llc.debug" $NAME; then TEMP=`mktemp -t temp` cp $NAME $TEMP sed -n "s/^define [^@]@$[A-Za-z0-9_]$(.$/\1/p" < $NAME \| \ while read FUNC; do sed -i '' "s/;$.$$[A-Za-z0-9_-]$:$ $$FUNC: \$/;\1\2-LABEL:\3$FUNC:/g" $TEMP done sed -i '' "s/;$.$-LABEL-LABEL:/;\1-LABEL:/" $TEMP sed -i '' "s/;$.$-NEXT-LABEL:/;\1-NEXT:/" $TEMP sed -i '' "s/;$.$-NOT-LABEL:/;\1-NOT:/" $TEMP sed -i '' "s/;$.*$-DAG-LABEL:/;\1-DAG:/" $TEMP mv $TEMP $NAME fi done llvm-svn: 186280	2013-07-14 06:24:09 +00:00
Stephen Lin	f799e3f944	Convert CodeGen//.ll tests to use the new CHECK-LABEL for easier debugging. No functionality change and all tests pass after conversion. This was done with the following sed invocation to catch label lines demarking function boundaries: sed -i '' "s/^;$ $$[A-Z0-9_]$:$ $test$[A-Za-z0-9_-]$:$ $$/;\1\2-LABEL:\3test\4:\5/g" test/CodeGen//*.ll which was written conservatively to avoid false positives rather than false negatives. I scanned through all the changes and everything looks correct. llvm-svn: 186258	2013-07-13 20:38:47 +00:00
Jim Grosbach	ebcad2e063	ARM: Fix incorrect pack pattern for thumb2 Propagate the fix from r185712 to Thumb2 codegen as well. Original commit message applies here as well: A "pkhtb x, x, y asr #num" uses the lower 16 bits of "y asr #num" and packs them in the bottom half of "x". An arithmetic and logic shift are only equivalent in this context if the shift amount is 16. We would be shifting in ones into the bottom 16bits instead of zeros if "y" is negative. rdar://14338767 llvm-svn: 185982	2013-07-09 22:59:22 +00:00
Tim Northover	52f77f5cda	ARM: allow predicated barriers in Thumb mode The barrier instructions are only "always-execute" in ARM mode, they can quite happily sit inside an IT block in Thumb. llvm-svn: 184964	2013-06-26 16:52:32 +00:00
Evan Cheng	4ec309700b	Cortex-R5 can issue Thumb2 integer division instructions. llvm-svn: 183275	2013-06-04 22:52:09 +00:00
Derek Schuff	bd7c6e5015	Fix ARM FastISel tests, as a first step to enabling ARM FastISel ARM FastISel is currently only enabled for iOS non-Thumb1, and I'm working on enabling it for other targets. As a first step I've fixed some of the tests. Changes to ARM FastISel tests: - Different triples don't generate the same relocations (especially movw/movt versus constant pool loads). Use a regex to allow either. - Mangling is different. Use a regex to allow either. - The reserved registers are sometimes different, so registers get allocated in a different order. Capture the names only where this occurs. - Add -verify-machineinstrs to some tests where it works. It doesn't work everywhere it should yet. - Add -fast-isel-abort to many tests that didn't have it before. - Split out the VarArg test from fast-isel-call.ll into its own test. This simplifies test setup because of --check-prefix. Patch by JF Bastien llvm-svn: 181801	2013-05-14 16:26:38 +00:00
Manman Ren	1a5ff287fd	TBAA: remove !tbaa from testing cases if not used. This will make it easier to turn on struct-path aware TBAA since the metadata format will change. llvm-svn: 180796	2013-04-30 17:52:57 +00:00
Jim Grosbach	48a91abc10	SDAG: Handle scalarizing an extend of a <1 x iN> vector. Just scalarize the element and rebuild a vector of the result type from that. rdar://13281568 llvm-svn: 176614	2013-03-07 05:47:54 +00:00
Jim Grosbach	a3c5c769d6	ARM: Creating a vector from a lane of another. The VDUP instruction source register doesn't allow a non-constant lane index, so make sure we don't construct a ARM::VDUPLANE node asking it to do so. rdar://13328063 http://llvm.org/bugs/show_bug.cgi?id=13963 llvm-svn: 176413	2013-03-02 20:16:24 +00:00
Kristof Beyls	0ba797e8f7	Make ARMAsmPrinter generate the correct alignment specifier syntax in instructions. The Printer will now print instructions with the correct alignment specifier syntax, like vld1.8 {d16}, [r0:64] llvm-svn: 175884	2013-02-22 10:01:33 +00:00
Jakob Stoklund Olesen	2ff4dc0ff2	Make RAFast::UsedInInstr indexed by register units. This fixes some problems with too conservative checking where we were marking all aliases of a register as used, and then also checking all aliases when allocating a register. <rdar://problem/13249625> llvm-svn: 175782	2013-02-21 19:35:21 +00:00
Jim Grosbach	3fa275e6f7	ARM: Allocation hints must make sure to be in the alloc order. When creating an allocation hint for a register pair, make sure the hint for the physical register reference is still in the allocation order. rdar://13240556 llvm-svn: 175541	2013-02-19 18:55:36 +00:00
Reid Kleckner	ab083f727b	FileCheck-ify some grep tests These tests in particular try to use escaped square brackets as an argument to grep, which is failing for me with native win32 python. It appears the backslash is being lost near the CreateProcess*() call. llvm-svn: 173506	2013-01-25 22:11:46 +00:00
Jakob Stoklund Olesen	ac6cfa41d6	Remove some register allocation order dependencies. llvm-svn: 172874	2013-01-19 00:03:32 +00:00
Evan Cheng	ddc0cb6dc5	On some ARM cpus, flags setting movs with shifter operand, i.e. lsl, lsr, asr, are more expensive than the non-flag setting variant. Teach thumb2 size reduction pass to avoid generating them unless we are optimizing for size. rdar://12892707 llvm-svn: 170728	2012-12-20 19:59:30 +00:00
Dmitri Gribenko	1c704355cf	Fix typos in CHECK lines. Patch by Alexander Zinenko. llvm-svn: 169547	2012-12-06 21:24:47 +00:00
Jakob Stoklund Olesen	e46a1046c0	Add GPRPair Register class to ARM. Some instructions in ARM require 2 even-odd paired GPRs. This patch adds support for such register class. Patch by Weiming Zhao! llvm-svn: 166816	2012-10-26 21:29:15 +00:00
Evan Cheng	59ed7d45a6	Fix a miscompilation caused by a typo. When turning a adde with negative value into a sbc with a positive number, the immediate should be complemented, not negated. Also added a missing pattern for ARM codegen. rdar://12559385 llvm-svn: 166613	2012-10-24 19:53:01 +00:00
Bob Wilson	e8a549cd92	Add LLVM support for Swift. llvm-svn: 164899	2012-09-29 21:43:49 +00:00
Evan Cheng	90ae8f8442	Use vld1 / vst2 for unaligned v2f64 load / store. e.g. Use vld1.16 for 2-byte aligned address. Based on patch by David Peixotto. Also use vld1.64 / vst1.64 with 128-bit alignment to take advantage of alignment hints. rdar://12090772, rdar://12238782 llvm-svn: 164089	2012-09-18 01:42:45 +00:00
Jakob Stoklund Olesen	f831059f60	Use predication instead of pseudo-opcodes when folding into MOVCC. Now that it is possible to dynamically tie MachineInstr operands, predicated instructions are possible in SSA form: %vreg3<def> = SUBri %vreg1, -2147483647, pred:14, pred:%noreg, %opt:%noreg %vreg4<def,tied1> = MOVCCr %vreg3<tied0>, %vreg1, %pred:12, pred:%CPSR Becomes a predicated SUBri with a tied imp-use: SUBri %vreg1, -2147483647, pred:13, pred:%CPSR, opt:%noreg, %vreg1<imp-use,tied0> This means that any instruction that is safe to move can be folded into a MOVCC, and the *CC pseudo-instructions are no longer needed. The test case changes reflect that Thumb2SizeReduce recognizes the predicated instructions. It didn't understand the pseudos. llvm-svn: 163274	2012-09-05 23:58:02 +00:00
Arnold Schwaighofer	f00fb1c581	Patch to implement UMLAL/SMLAL instructions for the ARM architecture This patch corrects the definition of umlal/smlal instructions and adds support for matching them to the ARM dag combiner. Bug 12213 Patch by Yin Ma! llvm-svn: 163136	2012-09-04 14:37:49 +00:00
Jakob Stoklund Olesen	0ea1fce6b4	Add ADD and SUB to the predicable ARM instructions. It is not my plan to duplicate the entire ARM instruction set with predicated versions. We need a way of representing predicated instructions in SSA form without requiring a separate opcode. Then the pseudo-instructions can go away. llvm-svn: 162061	2012-08-16 23:21:55 +00:00
Jakob Stoklund Olesen	6cb96120f1	Fold predicable instructions into MOVCC / t2MOVCC. The ARM select instructions are just predicated moves. If the select is the only use of an operand, the instruction defining the operand can be predicated instead, saving one instruction and decreasing register pressure. This implementation can turn AND/ORR/EOR instructions into their corresponding ANDCC/ORRCC/EORCC variants. Ideally, we should be able to predicate any instruction, but we don't yet support predicated instructions in SSA form. llvm-svn: 161994	2012-08-15 22:16:39 +00:00
Jush Lu	e67e07b901	[arm-fast-isel] Add support for vararg function calls. llvm-svn: 160500	2012-07-19 09:49:00 +00:00
Chandler Carruth	ff123d5c63	Fix the remaining TCL-style quotes found in the testsuite. This is another mechanical change accomplished though the power of terrible Perl scripts. I have manually switched some "s to 's to make escaping simpler. While I started this to fix tests that aren't run in all configurations, the massive number of tests is due to a really frustrating fragility of our testing infrastructure: things like 'grep -v', 'not grep', and 'expected failures' can mask broken tests all too easily. Essentially, I'm deeply disturbed that I can change the testsuite so radically without causing any change in results for most platforms. =/ llvm-svn: 159547	2012-07-02 19:09:46 +00:00
Bob Wilson	2297221028	Do not attempt to use ROR for Thumb1. Patch by Matt Fischer! llvm-svn: 159538	2012-07-02 17:22:47 +00:00
Chandler Carruth	872ac7cfad	Fix the TCL-style quoting in one random test that somehow slipped through my perl nets. With this, the test suite passes even if I force it to run with the built-in shell test logic, except for a test which REQUIREs shell. llvm-svn: 159529	2012-07-02 13:29:47 +00:00
Chandler Carruth	a5a29f970e	Convert all tests using TCL-style quoting to use shell-style quoting. This was done through the aid of a terrible Perl creation. I will not paste any of the horrors here. Suffice to say, it require multiple staged rounds of replacements, state carried between, and a few nested-construct-parsing hacks that I'm not proud of. It happens, by luck, to be able to deal with all the TCL-quoting patterns in evidence in the LLVM test suite. If anyone is maintaining large out-of-tree test trees, feel free to poke me and I'll send you the steps I used to convert things, as well as answer any painful questions etc. IRC works best for this type of thing I find. Once converted, switch the LLVM lit config to use ShTests the same as Clang. In addition to being able to delete large amounts of Python code from 'lit', this will also simplify the entire test suite and some of lit's architecture. Finally, the test suite runs 33% faster on Linux now. ;] For my 16-hardware-thread (2x 4-core xeon e5520): 36s -> 24s llvm-svn: 159525	2012-07-02 12:47:22 +00:00
Jakob Stoklund Olesen	41ebcda8f4	Add a test case for global live range splitting. llvm-svn: 157357	2012-05-23 23:42:23 +00:00
Jakob Stoklund Olesen	0ce90494e6	Add a last resort tryInstructionSplit() to RAGreedy. Live ranges with a constrained register class may benefit from splitting around individual uses. It allows the remaining live range to use a larger register class where it may allocate. This is like spilling to a different register class. This is only attempted on constrained register classes. <rdar://problem/11438902> llvm-svn: 157354	2012-05-23 22:37:27 +00:00
Jim Grosbach	da04fa0d02	FileCheck'ize test, and add a bit to test for r157221. llvm-svn: 157222	2012-05-21 23:50:00 +00:00
Jakob Stoklund Olesen	691ae3388f	Use the right register class for LDRrs. llvm-svn: 157152	2012-05-20 06:38:47 +00:00
Jim Grosbach	4b63d2ae1d	Refactor data-in-code annotations. Use a dedicated MachO load command to annotate data-in-code regions. This is the same format the linker produces for final executable images, allowing consistency of representation and use of introspection tools for both object and executable files. Data-in-code regions are annotated via ".data_region"/".end_data_region" directive pairs, with an optional region type. data_region_directive := ".data_region" { region_type } region_type := "jt8" \| "jt16" \| "jt32" \| "jta32" end_data_region_directive := ".end_data_region" The previous handling of ARM-style "$d.*" labels was broken and has been removed. Specifically, it didn't handle ARM vs. Thumb mode when marking the end of the section. rdar://11459456 llvm-svn: 157062	2012-05-18 19:12:01 +00:00
Jakob Stoklund Olesen	589c6eb95c	Remove -join-physregs from the test suite. This option has been disabled for a while, and it is going away so I can clean up the coalescer code. The tests that required physreg joining to be enabled were almost all of the form "tiny function with interference between arguments and return value". Such functions are usually inlined in the real world. The problem exposed by phys_subreg_coalesce-3.ll is real, but fairly rare. llvm-svn: 157027	2012-05-17 23:44:19 +00:00
Danil Malyshev	47aba39004	Added a regress test for the bug #9964 before close it. This bug was fixed by Jim Grosbach in #138879, thanks Jim! llvm-svn: 156505	2012-05-09 19:07:04 +00:00
Sebastian Pop	2420e8b7d5	Added missing CMN case in Thumb2SizeReduction pass so that LLVM emits 16-bits encoding of CMN instructions. llvm-svn: 156195	2012-05-04 19:53:56 +00:00
Evan Cheng	9f7ad310b5	If triple is armv7 / thumbv7 and a CPU is specified, do not automatically assume the feature set of v7a. This comes about if the user specifies something like -arch armv7 -mcpu=cortex-m3. We shouldn't be generating instructions such as uxtab in this case. rdar://11318438 llvm-svn: 155601	2012-04-26 01:13:36 +00:00
Chandler Carruth	1f5580b6f3	Fix updateTerminator to be resiliant to degenerate terminators where both fallthrough and a conditional branch target the same successor. Gracefully delete the conditional branch and introduce any unconditional branch needed to reach the actual successor. This fixes memory corruption in 2009-06-15-RegScavengerAssert.ll and possibly other tests. Also, while I'm here fix a latent bug I spotted by inspection. I never applied the same fundamental fix to this fallthrough successor finding logic that I did to the logic used when there are no conditional branches. As a consequence it would have selected landing pads had they be aligned in just the right way here. I don't have a test case as I spotted this by inspection, and the previous time I found this required have of TableGen's source code to produce it. =/ I hate backend bugs. ;] Thanks to Jim Grosbach for helping me reason through this and reviewing the fix. llvm-svn: 154867	2012-04-16 22:03:00 +00:00
Chandler Carruth	4190b507c5	Flip the new block-placement pass to be on by default. This is mostly to test the waters. I'd like to get results from FNT build bots and other bots running on non-x86 platforms. This feature has been pretty heavily tested over the last few months by me, and it fixes several of the execution time regressions caused by the inlining work by preventing inlining decisions from radically impacting block layout. I've seen very large improvements in yacr2 and ackermann benchmarks, along with the expected noise across all of the benchmark suite whenever code layout changes. I've analyzed all of the regressions and fixed them, or found them to be impossible to fix. See my email to llvmdev for more details. I'd like for this to be in 3.1 as it complements the inliner changes, but if any failures are showing up or anyone has concerns, it is just a flag flip and so can be easily turned off. I'm switching it on tonight to try and get at least one run through various folks' performance suites in case SPEC or something else has serious issues with it. I'll watch bots and revert if anything shows up. llvm-svn: 154816	2012-04-16 13:49:17 +00:00
Jakob Stoklund Olesen	37492eac8c	Don't break the IV update in TLI::SimplifySetCC(). LSR always tries to make the ICmp in the loop latch use the incremented induction variable. This allows the induction variable to be kept in a single register. When the induction variable limit is equal to the stride, SimplifySetCC() would break LSR's hard work by transforming: (icmp (add iv, stride), stride) --> (cmp iv, 0) This forced us to use lea for the IC update, preventing the simpler incl+cmp. <rdar://problem/7643606> <rdar://problem/11184260> llvm-svn: 154119	2012-04-05 20:30:20 +00:00
Jakob Stoklund Olesen	b6a7a89289	Don't kill the base register when expanding strd. When an strd instruction doesn't get the registers it wants, it can be expanded into two str instructions. Make sure the first str doesn't kill the base register in the case where the base and data registers are identical: t2STRi12 %R0<kill>, %R0, 4, pred:14, pred:%noreg t2STRi12 %R2<kill>, %R0, 8, pred:14, pred:%noreg <rdar://problem/11101911> llvm-svn: 153611	2012-03-28 23:07:03 +00:00
Jakob Stoklund Olesen	9e512120b7	Spill DPair registers, not just QPR. The arm_neon intrinsics can create virtual registers from the DPair register class which allows both even-odd and odd-even D-register pairs. This fixes PR12389. llvm-svn: 153603	2012-03-28 21:20:32 +00:00
Eli Bendersky	f33086052d	Continue cleanup of LIT, getting rid of the remaining artifacts from dejagnu * Removed test/lib/llvm.exp - it is no longer needed * Deleted the dg.exp reading code from test/lit.cfg. There are no dg.exp files left in the test suite so this code is no longer required. test/lit.cfg is now much shorter and clearer * Removed a lot of duplicate code in lit.local.cfg files that need access to the root configuration, by adding a "root" attribute to the TestingConfig object. This attribute is dynamically computed to provide the same information as was previously provided by the custom getRoot functions. * Documented the config.root attribute in docs/CommandGuide/lit.pod llvm-svn: 153408	2012-03-25 09:02:19 +00:00
Jakob Stoklund Olesen	92c15b2b2c	Enable ARM base pointer when calling functions with large arguments. When an outgoing call takes more than 2k of arguments on the stack, we don't allocate that call frame in the prolog, but adjust the stack pointer immediately before the call instead. This causes problems with the emergency spill slot because PEI can't track stack pointer adjustments on the second pass, and if the outgoing arguments are too big, SP can't be used to reach the emergency spill slot at all. Work around these problems by ensuring there is a base or frame pointer that can be used to access the emergency spill slot. <rdar://problem/10917166> llvm-svn: 151604	2012-02-28 01:15:01 +00:00
Jim Grosbach	c01104dfbf	Thumb2 size reduction fix for tied operands of tMUL. The tied source operand of tMUL is the second source operand, not the first like every other two-address thumb instruction. Special case it in the size reduction pass to make sure we create the tMUL instruction properly. llvm-svn: 151315	2012-02-24 00:33:36 +00:00
Eli Bendersky	924f9a671d	Replace all instances of dg.exp file with lit.local.cfg, since all tests are run with LIT now and now Dejagnu. dg.exp is no longer needed. Patch reviewed by Daniel Dunbar. It will be followed by additional cleanup patches. llvm-svn: 150664	2012-02-16 06:28:33 +00:00
Evan Cheng	6bb95253eb	After r147827 and r147902, it's now possible for unallocatable registers to be live across BBs before register allocation. This miscompiled 197.parser when a cmp + b are optimized to a cbnz instruction even though the CPSR def is live-in a successor. cbnz r6, LBB89_12 ... LBB89_12: ble LBB89_1 The fix consists of two parts. 1) Teach LiveVariables that some unallocatable registers might be liveouts so don't mark their last use as kill if they are. 2) ARM constantpool island pass shouldn't form cbz / cbnz if the conditional branch does not kill CPSR. rdar://10676853 llvm-svn: 148168	2012-01-14 01:53:46 +00:00
Jakob Stoklund Olesen	20f1dd5faf	Consider unknown alignment caused by OptimizeThumb2Instructions(). This function runs after all constant islands have been placed, and may shrink some instructions to their 2-byte forms. This can actually cause some constant pool entries to move out of range because of growing alignment padding. Treat instructions that may be shrunk the same as inline asm - they erode the known alignment bits. Also reinstate an old assertion in verify(). It is correct now that basic block offsets include alignments. Add a single large test case that will hopefully exercise many parts of the constant island pass. <rdar://problem/10670199> llvm-svn: 147885	2012-01-10 22:32:14 +00:00
Evan Cheng	0be4144a68	Allow machine-cse to look across MBB boundary when cse'ing instructions that define physical registers. It's currently very restrictive, only catching cases where the CE is in an immediate (and only) predecessor. But it catches a surprising large number of cases. rdar://10660865 llvm-svn: 147827	2012-01-10 02:02:58 +00:00
Jakob Stoklund Olesen	68a922c0e9	Enable aligned NEON spilling by default. Experiments show this to be a small speedup for modern ARM cores. llvm-svn: 147689	2012-01-06 22:19:37 +00:00
Jakob Stoklund Olesen	d110e2a83f	Reapply r146997, "Heed spill slot alignment on ARM." Now that canRealignStack() understands frozen reserved registers, it is safe to use it for aligned spill instructions. It will only return true if the registers reserved at the beginning of register allocation allow for dynamic stack realignment. <rdar://problem/10625436> llvm-svn: 147579	2012-01-05 00:26:57 +00:00
Evan Cheng	801d98b3f0	Fix more places which should be checking for iOS, not darwin. llvm-svn: 147513	2012-01-04 01:55:04 +00:00
Jakob Stoklund Olesen	1b7f2a7638	Revert r146997, "Heed spill slot alignment on ARM." This patch caused a miscompilation of oggenc because a frame pointer was suddenly needed halfway through register allocation. <rdar://problem/10625436> llvm-svn: 147487	2012-01-03 22:34:35 +00:00
Jakob Stoklund Olesen	0965585cb1	Experimental support for aligned NEON spills. ARM targets with NEON units have access to aligned vector loads and stores that are potentially faster than unaligned operations. Add support for spilling the callee-saved NEON registers to an aligned stack area using 16-byte aligned NEON loads and store. This feature is off by default, controlled by an -align-neon-spills command line option. llvm-svn: 147211	2011-12-23 00:36:18 +00:00

1 2 3 4 5 ...

518 Commits