llvm-project

Commit Graph

Author	SHA1	Message	Date
Tom Roeder	4b239ed3ec	Adding explicit triples to the ARM jumptable tests llvm-svn: 210288	2014-06-05 21:40:13 +00:00
Tom Roeder	44cb65fff1	Add a new attribute called 'jumptable' that creates jump-instruction tables for functions marked with this attribute. It includes a pass that rewrites all indirect calls to jumptable functions to pass through these tables. This also adds backend support for generating the jump-instruction tables on ARM and X86. Note that since the jumptable attribute creates a second function pointer for a function, any function marked with jumptable must also be marked with unnamed_addr. llvm-svn: 210280	2014-06-05 19:29:43 +00:00
Rafael Espindola	64c1e18033	Allow alias to point to an arbitrary ConstantExpr. This patch changes GlobalAlias to point to an arbitrary ConstantExpr and it is up to MC (or the system assembler) to decide if that expression is valid or not. This reduces our ability to diagnose invalid uses and how early we can spot them, but it also lets us do things like @test5 = alias inttoptr(i32 sub (i32 ptrtoint (i32* @test2 to i32), i32 ptrtoint (i32* @bar to i32)) to i32) An important implication of this patch is that the notion of aliased global doesn't exist any more. The alias has to encode the information needed to access it in its metadata (linkage, visibility, type, etc). Another consequence to notice is that getSection has to return a "const char ". It could return a NullTerminatedStringRef if there was such a thing, but when that was proposed the decision was to just uses "const char*" for that. llvm-svn: 210062	2014-06-03 02:41:57 +00:00
Christian Pirker	762b2c624f	ARMEB: Fix function return type f64 Reviewed at http://reviews.llvm.org/D3968 llvm-svn: 209990	2014-06-01 09:30:52 +00:00
Tim Northover	d622e1282c	SelectionDAG: skip barriers for unordered atomic operations Unordered is strictly weaker than monotonic, so if the latter doesn't have any barriers then the former certainly shouldn't. rdar://problem/16548260 llvm-svn: 209901	2014-05-30 14:41:51 +00:00
Tim Northover	86f60b7266	ARM: use AAPCS-style prologues for embedded MachO. Darwin prologues save their GPRs in two stages: a narrow push of r0-r7 & lr, followed by a wide push of the remaining registers if there are any. AAPCS uses a single push.w instruction. It turns out that, on average, enough registers get pushed that code is smaller in the AAPCS prologue, which is a nice property for M-class programmers. They also have other options available for back-traces, so can hopefully deal with the fact that FP & LR aren't adjacent in memory. rdar://problem/15909583 llvm-svn: 209895	2014-05-30 13:23:06 +00:00
Tim Northover	b4ddc0845a	ARM & AArch64: make use of common cmpxchg idioms after expansion The C and C++ semantics for compare_exchange require it to return a bool indicating success. This gets mapped to LLVM IR which follows each cmpxchg with an icmp of the value loaded against the desired value. When lowered to ldxr/stxr loops, this extra comparison is redundant: its results are implicit in the control-flow of the function. This commit makes two changes: it replaces that icmp with appropriate PHI nodes, and then makes sure earlyCSE is called after expansion to actually make use of the opportunities revealed. I've also added -{arm,aarch64}-enable-atomic-tidy options, so that existing fragile tests aren't perturbed too much by the change. Many of them either rely on undef/unreachable too pervasively to be restored to something well-defined (particularly while making sure they test the same obscure assert from many years ago), or depend on a particular CFG shape, which is disrupted by SimplifyCFG. rdar://problem/16227836 llvm-svn: 209883	2014-05-30 10:09:59 +00:00
Tim Northover	ce6538c38d	AArch64 & ARM: remove undefined behaviour from some tests. llvm-svn: 209880	2014-05-30 08:59:55 +00:00
Amara Emerson	ceeb1c4830	[ARM] Emit correct build attributes for the relocation models. Patch by Asiri Rathnayake. llvm-svn: 209656	2014-05-27 13:30:21 +00:00
Tim Northover	4f1909f1da	ARM: teach AAPCS-VFP to deal with Cortex-M4. Cortex-M4 only has single-precision floating point support, so any LLVM "double" type will have been split into 2 i32s by now. Fortunately, the consecutive-register framework turns out to be precisely what's needed to reconstruct the double and follow AAPCS-VFP correctly! rdar://problem/17012966 llvm-svn: 209650	2014-05-27 10:43:38 +00:00
Tim Northover	f9e798ba6a	Segmented stacks: omit __morestack call when there's no frame. Patch by Florian Zeitz llvm-svn: 209436	2014-05-22 13:03:43 +00:00
Saleem Abdulrasool	2bd1262a32	ARM: introduce llvm.arm.undefined intrinsic This intrinsic permits the emission of platform specific undefined sequences. ARM has reserved the 0xde opcode which takes a single integer parameter (ignored by the CPU). This permits the operating system to implement custom behaviour on this trap. The llvm.arm.undefined intrinsic is meant to provide a means for generating the target specific behaviour from the frontend. This is particularly useful for Windows on ARM which has made use of a series of these special opcodes. llvm-svn: 209390	2014-05-22 04:46:46 +00:00
Saleem Abdulrasool	8d60fdc50d	ARM: correct bundle generation for MOV32T relocations Although the previous code would construct a bundle and add the correct elements to it, it would not finalise the bundle. This resulted in the InternalRead markers not being added to the MachineOperands nor, more importantly, the externally visible defs to the bundle itself. So, although the bundle was not exposing the def, the generated code would be correct because there was no optimisations being performed. When optimisations were enabled, the post register allocator would kick in, and the hazard recognizer would reorder operations around the load which would define the value being operated upon. Rather than manually constructing the bundle, simply construct and finalise the bundle via the finaliseBundle call after both MIs have been emitted. This improves the code generation with optimisations where IMAGE_REL_ARM_MOV32T relocations are emitted. The changes to the other tests are the result of the bundle generation preventing the scheduler from hoisting the moves across the loads. The net effect of the generated code is equivalent, but, is much more identical to what is actually being lowered. llvm-svn: 209267	2014-05-21 01:25:24 +00:00
Benjamin Kramer	f3ad23551d	SDAG: Legalize vector BSWAP into a shuffle if the shuffle is legal but the bswap not. - On ARM/ARM64 we get a vrev because the shuffle matching code is really smart. We still unroll anything that's not v4i32 though. - On X86 we get a pshufb with SSSE3. Required more cleverness in isShuffleMaskLegal. - On PPC we get a vperm for v8i16 and v4i32. v2i64 is unrolled. llvm-svn: 209123	2014-05-19 13:12:38 +00:00
Saleem Abdulrasool	a521845381	ARM: improve WoA ABI conformance for frame register Windows on ARM uses R11 for the frame pointer even though the environment is a pure Thumb-2, thumb-only environment. Replicate this behaviour to improve Windows ABI compatibility. This register is used for fast stack walking, and thus is part of the Windows ABI. llvm-svn: 209085	2014-05-18 04:12:52 +00:00
Saleem Abdulrasool	2476759673	test: fix copy-paste mistake Accidental over-quoting of the match string. llvm-svn: 209058	2014-05-17 04:32:38 +00:00
Saleem Abdulrasool	46fed305db	ARM: use the proper target object format for WoA WoA uses COFF, not ELF. ARMISelLowering::createTLOF would previously return ELF for any non-MachO platform. This was a missed site when the original change for target format support for Windows on ARM was done. llvm-svn: 209057	2014-05-17 04:28:08 +00:00
Rafael Espindola	6b238633b7	Fix most of PR10367. This patch changes the design of GlobalAlias so that it doesn't take a ConstantExpr anymore. It now points directly to a GlobalObject, but its type is independent of the aliasee type. To avoid changing all alias related tests in this patches, I kept the common syntax @foo = alias i32* @bar to mean the same as now. The cases that used to use cast now use the more general syntax @foo = alias i16, i32* @bar. Note that GlobalAlias now behaves a bit more like GlobalVariable. We know that its type is always a pointer, so we omit the '*'. For the bitcode, a nice surprise is that we were writing both identical types already, so the format change is minimal. Auto upgrade is handled by looking through the casts and no new fields are needed for now. New bitcode will simply have different types for Alias and Aliasee. One last interesting point in the patch is that replaceAllUsesWith becomes smart enough to avoid putting a ConstantExpr in the aliasee. This seems better than checking and updating every caller. A followup patch will delete getAliasedGlobal now that it is redundant. Another patch will add support for an explicit offset. llvm-svn: 209007	2014-05-16 19:35:39 +00:00
David Blaikie	825f487b68	DebugInfo: Assume the CU's Subprogram list only contains definitions. DIBuilder maintains this invariant and the current DwarfDebug code could end up doing weird things if it contained declarations (such as putting the definition DIE inside a CU that contained the declaration - this doesn't seem like a good idea, so rather than adding logic to handle this case we'll just ban in for now & cross that bridge if we come to it later). llvm-svn: 209004	2014-05-16 18:26:53 +00:00
James Molloy	a70697e10e	Re-enable inline memcpy expansion for Thumb1. Patch by Moritz Roth! llvm-svn: 208994	2014-05-16 14:24:22 +00:00
Rafael Espindola	5a52b9f139	Revert "Implement global merge optimization for global variables." This reverts commit r208934. The patch depends on aliases to GEPs with non zero offsets. That is not supported and fairly broken. The good news is that GlobalAlias is being redesigned and will have support for offsets, so this patch should be a nice match for it. llvm-svn: 208978	2014-05-16 13:02:18 +00:00
Saleem Abdulrasool	056fc3da4a	ARM: add some integer/floating point conversion libcalls Add some Windows on ARM specific library calls. These are provided by msvcrt, and can be used to perform integer to floating-point conversions (and vice-versa) mirroring similar functions in the RTABI. llvm-svn: 208949	2014-05-16 05:41:33 +00:00
Jiangning Liu	932e1c3924	Implement global merge optimization for global variables. This commit implements two command line switches -global-merge-on-external and -global-merge-aligned, and both of them are false by default, so this optimization is disabled by default for all targets. For ARM64, some back-end behaviors need to be tuned to get this optimization further enabled. llvm-svn: 208934	2014-05-15 23:45:42 +00:00
Jay Foad	a0653a3e6c	Rename ComputeMaskedBits to computeKnownBits. "Masked" has been inappropriate since it lost its Mask parameter in r154011. llvm-svn: 208811	2014-05-14 21:14:37 +00:00
Christian Pirker	6692e7c116	ARM-BE: test files for vector argument passing Reviewed at http://reviews.llvm.org/D3766 llvm-svn: 208793	2014-05-14 16:59:44 +00:00
Logan Chien	95188b9092	Fix ARM EHABI when function has landingpad and nounwind. If the function has the landingpad instruction, then the handlerdata should be emitted even if the function has nouwnind attribute. Otherwise, following code will not work: void test1() noexcept { try { throw_exception(); } catch (...) { log_unexpected_exception(); } } Since the cantunwind was incorrectly emitted and the LSDA is not available. llvm-svn: 208791	2014-05-14 16:38:30 +00:00
Logan Chien	ba1b6951c3	More test case for r208715. The commit r208166 will cause some regression on ARM EHABI. This fix has been committed in r208715, and an assertion failure test case has been committed in r208770. This commit further extends the unittest so that the actual value in the handlerdata will be checked. llvm-svn: 208790	2014-05-14 16:37:32 +00:00
Evgeniy Stepanov	b4aa2b422b	Regression test for ARM EHABI breakage in r208166. llvm-svn: 208770	2014-05-14 11:13:31 +00:00
Christian Pirker	39db7ec81f	ARMEB: Fix byte order of EH frame unwinding instructions, with modified test file This commit was already commited as revision rL208689 and discussd in phabricator revision D3704. But the test file was crashing on OS X and windows. I fixed the test file in the same way as in rL208340. llvm-svn: 208711	2014-05-13 16:44:30 +00:00
Rafael Espindola	2e7eceb317	Revert "ARMEB: Fix byte order of EH frame unwinding instructions" This reverts commit r208689. The test was crashing on OS X and windows. llvm-svn: 208704	2014-05-13 15:19:56 +00:00
Christian Pirker	ea3514ecdb	ARMEB: Fix byte order of EH frame unwinding instructions llvm-svn: 208689	2014-05-13 11:41:49 +00:00
Louis Gerbarg	b4013235e3	Fix ARM bswap16.ll test on Windows Windows on ARM only supports thumb mode execution, so we have to explicitly pick some non-Windows OS to test ARM mode codegen. llvm-svn: 208638	2014-05-12 22:13:07 +00:00
Louis Gerbarg	efdcf23736	Add support bswap16 to/from memory compiling to rev16 on ARM/Thumb The current patterns for REV16 misses mostn __builtin_bswap16() due to legalization promoting the operands to from load/stores toi32s and then truncing/extending them. This patch adds new patterns that catch the resultant DAGs and codegens them to rev16 instructions. Tests included. rdar://15353652 llvm-svn: 208620	2014-05-12 19:53:52 +00:00
Christian Pirker	238c7c165b	ARM: Implement big endian bit-conversion for NEON type llvm-svn: 208538	2014-05-12 11:19:20 +00:00
Reid Kleckner	d0eda92845	Fix ARM intrinsics-overflow.ll test on Windows Windows on ARM only supports thumb mode execution, so we have to explicitly pick some non-Windows OS to test ARM mode codegen. llvm-svn: 208448	2014-05-09 21:52:48 +00:00
Louis Gerbarg	3342bf1451	Add custom lowering for add/sub with overflow intrinsics to ARM This patch adds support to ARM for custom lowering of the llvm.{u\|s}add.with.overflow.i32 intrinsics for i32/i64. This is particularly useful for handling idiomatic saturating math functions as generated by InstCombineCompare. Test cases included. rdar://14853450 llvm-svn: 208435	2014-05-09 17:02:49 +00:00
James Molloy	dd1aa14a21	Attempt to pacify the bots - this commit requires asserts. llvm-svn: 208424	2014-05-09 16:20:53 +00:00
Oliver Stannard	c24f2171ca	ARM: HFAs must be passed in consecutive registers When using the ARM AAPCS, HFAs (Homogeneous Floating-point Aggregates) must be passed in a block of consecutive floating-point registers, or on the stack. This means that unused floating-point registers cannot be back-filled with part of an HFA, however this can currently happen. This patch, along with the corresponding clang patch (http://reviews.llvm.org/D3083) prevents this. llvm-svn: 208413	2014-05-09 14:01:47 +00:00
Saleem Abdulrasool	40bca0afab	ARM: support PIC on Windows on ARM Handle lowering of global addresses for PIC mode compilation on Windows. Always use the movw/movt load to load the address as Windows on ARM requires ARMv7+ and is a pure Thumb environment. llvm-svn: 208385	2014-05-09 00:58:32 +00:00
Justin Bogner	7833d9facb	test/CodeGen: Check that the correct register is used in a store This tightens up r208351 to ensure that a store is fed with the correct value. Thanks to Quentin Colombet for spotting this! llvm-svn: 208368	2014-05-08 22:45:07 +00:00
Justin Bogner	1de42075fc	Make a CodeGen test more robust against vector register selection llvm-svn: 208351	2014-05-08 18:53:56 +00:00
Saleem Abdulrasool	39a939d7d2	test: fix test on Windows When building on Windows, the default target is Windows. Windows on ARM does not support ARM mode compilation, resulting in test failures. Simply specify a triple to ensure that we are testing the correct behaviour. llvm-svn: 208340	2014-05-08 17:11:29 +00:00
Christian Pirker	b5728191c2	ARM big endian function argument passing llvm-svn: 208316	2014-05-08 14:06:24 +00:00
Joerg Sonnenberger	cf86ce136c	Allow using normal .eh_frame based unwinding on ARM. Use the same encodings as x86. Use this exception model for NetBSD. llvm-svn: 208166	2014-05-07 07:49:34 +00:00
Saleem Abdulrasool	acd0338c61	ARM: fix WoA PEI instruction selection The ARM::BLX instruction is an ARM mode instruction. The Windows on ARM target is limited to Thumb instructions. Correctly use the thumb mode tBLXr instruction. This would manifest as an errant write into the object file as the instruction is 4-bytes in length rather than 2. The result would be a corrupted object file that would eventually result in an executable that would crash at runtime. llvm-svn: 208152	2014-05-07 03:03:27 +00:00
Joerg Sonnenberger	818e725158	If a function needs a frame pointer, but r11 (aka fp) has not been used, remove it from the list of unspilled registers. Otherwise the following attempt to keep the stack aligned by picking an extra GPR register to spill will not work as it picks up r11. llvm-svn: 208129	2014-05-06 20:43:01 +00:00
Renato Golin	c7aea40ec6	Implememting named register intrinsics This patch implements the infrastructure to use named register constructs in programs that need access to specific registers (bare metal, kernels, etc). So far, only the stack pointer is supported as a technology preview, but as it is, the intrinsic can already support all non-allocatable registers from any architecture. llvm-svn: 208104	2014-05-06 16:51:25 +00:00
Saleem Abdulrasool	e8a7afef86	CodeGen: correct memset emittance for WoA Windows on ARM does not conform to AEABI. However, memset would be emitted using the AEABI signature, resulting in inverted parameters. Handle this special case appropriately. llvm-svn: 207943	2014-05-04 23:13:21 +00:00
Saleem Abdulrasool	9c4716e4b6	CodeGen: strengthen WoA AEABI avoidance tests Add additional test cases for WoA AEABI avoidance checking. llvm-svn: 207942	2014-05-04 23:13:18 +00:00
Saleem Abdulrasool	25947c318b	ARM: support stack probe emission for Windows on ARM This introduces the stack lowering emission of the stack probe function for Windows on ARM. The stack on Windows on ARM is a dynamically paged stack where any page allocation which crosses a page boundary of the following guard page will cause a page fault. This page fault must be handled by the kernel to ensure that the page is faulted in. If this does not occur and a write access any memory beyond that, the page fault will go unserviced, resulting in an abnormal program termination. The watermark for the stack probe appears to be at 4080 bytes (for accommodating the stack guard canaries and stack alignment) when SSP is enabled. Otherwise, the stack probe is emitted on the page size boundary of 4096 bytes. llvm-svn: 207615	2014-04-30 07:05:07 +00:00
Saleem Abdulrasool	f8222631a5	ARM: partially handle 32-bit relocations for WoA IMAGE_REL_ARM_MOV32T relocations require that the movw/movt pair-wise relocation is not split up and reordered. When expanding the mov32imm pseudo-instruction, create a bundle if the machine operand is referencing an address. This helps ensure that the relocatable address load is not reordered by subsequent passes. Unfortunately, this only partially handles the case as the Constant Island Pass occurs after the instructions are unbundled and does not properly handle bundles. That is a more fundamental issue with the pass itself and beyond the scope of this change. llvm-svn: 207608	2014-04-30 04:54:58 +00:00
Tim Northover	aacce57d61	ARM: fix test after change to indirect symbol emission. llvm-svn: 207519	2014-04-29 10:13:10 +00:00
Tim Northover	2372301bcf	ARM: emit hidden stubs into a proper non_lazy_symbol_pointer section. rdar://problem/16660411 llvm-svn: 207517	2014-04-29 10:06:05 +00:00
Saleem Abdulrasool	99f0d458c3	ARM: remove @llvm.arm.sevl This intrinsic is no longer needed with the new @llvm.arm.hint(i32) intrinsic which provides a generic, extensible manner for adding hint instructions. This functionality can now be represented as @llvm.arm.hint(i32 5). llvm-svn: 207246	2014-04-25 17:51:25 +00:00
Saleem Abdulrasool	7e7c2f9ca6	ARM: provide a new generic hint intrinsic Introduce the llvm.arm.hint(i32) intrinsic that can be used to inject hints into the instruction stream. This is particularly useful for generating IR from a compiler where the user may inject an intrinsic (e.g. __yield). These are then pattern substituted into the correct instruction which already existed. llvm-svn: 207242	2014-04-25 17:24:24 +00:00
Reid Kleckner	feb1148ed6	Fix test/CodeGen/arm.ll The 'CHECK: add' line was occasionally matching against the filename, breaking the subsequent CHECK-NOT. Also use CHECK-LABEL. llvm-svn: 206936	2014-04-23 01:09:29 +00:00
Tim Northover	978d25f391	ARM: disable emission of __XYZvfp in soft-float environment. The point of these calls is to allow Thumb-1 code to make use of the VFP unit to perform its operations. This is not desirable with -msoft-float, since most of the reasons you'd want that apply equally to the runtime library. rdar://problem/13766161 llvm-svn: 206874	2014-04-22 10:10:09 +00:00
Akira Hatanaka	3d90f99d1a	Make FastISel::SelectInstruction return before target specific fast-isel code handles Intrinsic::trap if TargetOptions::TrapFuncName is set. This fixes a bug in which the trap function was not taken into consideration when a program was compiled without optimization (at -O0). <rdar://problem/16291933> llvm-svn: 206323	2014-04-15 21:30:06 +00:00
Akira Hatanaka	5638b89944	Fix a bug in which BranchProbabilityInfo wasn't setting branch weights of basic blocks inside loops correctly. Previously, BranchProbabilityInfo::calcLoopBranchHeuristics would determine the weights of basic blocks inside loops even when it didn't have enough information to estimate the branch probabilities correctly. This patch fixes the function to exit early if it doesn't see any exit edges or back edges and let the later heuristics determine the weights. This fixes PR18705 and <rdar://problem/15991090>. Differential Revision: http://reviews.llvm.org/D3363 llvm-svn: 206194	2014-04-14 16:56:19 +00:00
Richard Trieu	3df79775c5	Fix 2008-03-05-SxtInRegBug.ll so that the CHECK-NOT will not match the filename. llvm-svn: 206193	2014-04-14 16:53:50 +00:00
Richard Trieu	97a268d905	Add extra checks to mvn.ll test to prevent the "f1" check from matching on a directory name instead of the function name. llvm-svn: 206104	2014-04-12 04:47:04 +00:00
Hal Finkel	c3998306f4	Add the ability to use GEPs for address sinking in CGP The current memory-instruction optimization logic in CGP, which sinks parts of the address computation that can be adsorbed by the addressing mode, does this by explicitly converting the relevant part of the address computation into IR-level integer operations (making use of ptrtoint and inttoptr). For most targets this is currently not a problem, but for targets wishing to make use of IR-level aliasing analysis during CodeGen, the use of ptrtoint/inttoptr is a problem for two reasons: 1. BasicAA becomes less powerful in the face of the ptrtoint/inttoptr 2. In cases where type-punning was used, and BasicAA was used to override TBAA, BasicAA may no longer do so. (this had forced us to disable all use of TBAA in CodeGen; something which we can now enable again) This (use of GEPs instead of ptrtoint/inttoptr) is not currently enabled by default (except for those targets that use AA during CodeGen), and so aside from some PowerPC subtargets and SystemZ, there should be no change in behavior. We may be able to switch completely away from the ptrtoint/inttoptr sinking on all targets, but further testing is required. I've doubled-up on a number of existing tests that are sensitive to the address sinking behavior (including some store-merging tests that are sensitive to the order of the resulting ADD operations at the SDAG level). llvm-svn: 206092	2014-04-12 00:59:48 +00:00
Reid Kleckner	9c6582129a	Move the segmented stack switch to a function attribute This removes the -segmented-stacks command line flag in favor of a per-function "split-stack" attribute. Patch by Luqman Aden and Alex Crichton! llvm-svn: 205997	2014-04-10 22:58:43 +00:00
Saleem Abdulrasool	905b6d192c	ARM: yet another round of ARM test clean ups llvm-svn: 205586	2014-04-03 23:47:24 +00:00
Saleem Abdulrasool	717c991923	ARM: update even more tests More updating of tests to be explicit about the target triple rather than relying on the default target triple supporting ARM mode. Indicate to lit that object emission is not yet available for Windows on ARM. llvm-svn: 205545	2014-04-03 17:35:22 +00:00
Saleem Abdulrasool	7258735fa0	ARM: fixup more tests to specify the target more explicitly This changes the tests that were targeting ARM EABI to explicitly specify the environment rather than relying on the default. This breaks with the new Windows on ARM support when running the tests on Windows where the default environment is no longer EABI. Take the opportunity to avoid a pointless redirect (helps when trying to debug with providing a command line invocation which can be copy and pasted) and removing a few greps in favour of FileCheck. llvm-svn: 205541	2014-04-03 16:01:44 +00:00
Tim Northover	01b4aa9437	ARM: tell LLVM about zext properties of ldrexb/ldrexh Implementing this via ComputeMaskedBits has two advantages: + It actually works. DAGISel doesn't deal with the chains properly in the previous pattern-based solution, so they never trigger. + The information can be used in other DAG combines, as well as the trivial "get rid of truncs". For example if the trunc is in a different basic block. rdar://problem/16227836 llvm-svn: 205540	2014-04-03 15:10:35 +00:00
Tim Northover	70450c59a4	ARM: skip cmpxchg failure barrier if ordering is monotonic. The terminal barrier of a cmpxchg expansion will be either Acquire or SequentiallyConsistent. In either case it can be skipped if the operation has Monotonic requirements on failure. rdar://problem/15996804 llvm-svn: 205535	2014-04-03 13:06:54 +00:00
Tim Northover	c882eb0723	ARM: expand atomic ldrex/strex loops in IR The previous situation where ATOMIC_LOAD_WHATEVER nodes were expanded at MachineInstr emission time had grown to be extremely large and involved, to account for the subtly different code needed for the various flavours (8/16/32/64 bit, cmpxchg/add/minmax). Moving this transformation into the IR clears up the code substantially, and makes future optimisations much easier: 1. an atomicrmw followed by using the new value can be more efficient. As an IR pass, simple CSE could handle this efficiently. 2. Making use of cmpxchg success/failure orderings only has to be done in one (simpler) place. 3. The common "cmpxchg; did we store?" idiom can be exposed to optimisation. I intend to gradually improve this situation within the ARM backend and make sure there are no hidden issues before moving the code out into CodeGen to be shared with (at least ARM64/AArch64, though I think PPC & Mips could benefit too). llvm-svn: 205525	2014-04-03 11:44:58 +00:00
Silviu Baranga	a3106e6847	[ARM] When generating a vpaddl node the input lane type is not always the type of the add operation since extract_vector_elt can perform an extend operation. Get the input lane type from the vector on which we're performing the vpaddl operation on and extend or truncate it to the output type of the original add node. llvm-svn: 205523	2014-04-03 10:44:27 +00:00
Saleem Abdulrasool	009d0e96b7	ARM: fixup tests to specify the target more explicitly This changes the tests that were targeting ARM EABI to explicitly specify the environment rather than relying on the default. This breaks with the new Windows on ARM support when running the tests on Windows where the default environment is no longer EABI. Take the opportunity to avoid a pointless redirect (helps when trying to debug with providing a command line invocation which can be copy and pasted) and removing a few greps in favour of FileCheck. llvm-svn: 205465	2014-04-02 21:22:03 +00:00
Saleem Abdulrasool	cd1308296e	ARM: update subtarget information for Windows on ARM Update the subtarget information for Windows on ARM. This enables using the MC layer to target Windows on ARM. llvm-svn: 205459	2014-04-02 20:32:05 +00:00
Oliver Stannard	b14c625111	ARM: Add support for segmented stacks Patch by Alex Crichton, ILyoan, Luqman Aden and Svetoslav. llvm-svn: 205430	2014-04-02 16:10:33 +00:00
Renato Golin	d93295ea56	Remove duplicated DMB instructions ARM specific optimiztion, finding places in ARM machine code where 2 dmbs follow one another, and eliminating one of them. Patch by Reinoud Elhorst. llvm-svn: 205409	2014-04-02 09:03:43 +00:00
Tim Northover	1351030801	ARM: add cyclone CPU with ZeroCycleZeroing feature. The Cyclone CPU is similar to swift for most LLVM purposes, but does have two preferred instructions for zeroing a VFP register. This teaches LLVM about them. llvm-svn: 205309	2014-04-01 13:22:02 +00:00
David Blaikie	3464161070	DebugInfo: Avoid creating unnecessary/empty line tables and remove the special case of '0' in DwarfCompileUnit::initStmtList by just always using a label difference This moves one case of raw text checking down into the MCStreamer interfaces in the form of a virtual function, even if we ultimately end up consolidating on the one-or-many line tables issue one day, this is nicer in the interim. This just generally streamlines a bunch of use cases into a common code path. llvm-svn: 205287	2014-04-01 08:07:52 +00:00
Tim Northover	1ff5f29fb5	ARM: add intrinsics for the v8 ldaex/stlex We've already got versions without the barriers, so this just adds IR-level support for generating the new v8 ones. rdar://problem/16227836 llvm-svn: 204813	2014-03-26 14:39:31 +00:00
Renato Golin	c0a3c1d66b	Add @llvm.clear_cache builtin Implementing the LLVM part of the call to __builtin___clear_cache which translates into an intrinsic @llvm.clear_cache and is lowered by each target, either to a call to __clear_cache or nothing at all incase the caches are unified. Updating LangRef and adding some tests for the implemented architectures. Other archs will have to implement the method in case this builtin has to be compiled for it, since the default behaviour is to bail unimplemented. A Clang patch is required for the builtin to be lowered into the llvm intrinsic. This will be done next. llvm-svn: 204802	2014-03-26 12:52:28 +00:00
Saleem Abdulrasool	1425622ad8	test: fix CHECK lines Thanks to gix for pointing out that the CHECK-LABEL lines were incorrect! llvm-svn: 204700	2014-03-25 03:39:39 +00:00
Kevin Qin	67b9c50c53	Fix test command line to avoid generating output file. llvm-svn: 204437	2014-03-21 07:20:29 +00:00
Kevin Qin	275ce91243	Fix an assertion caused by using inline asm with indirect register inputs. llvm-svn: 204425	2014-03-21 02:14:50 +00:00
Weiming Zhao	0152485679	Fix PR19136: [ARM] Fix Folding SP Update into vpush/vpop Sicne MBB->computeRegisterLivenes() returns Dead for sub regs like s0, d0 is used in vpop instead of updating sp, which causes s0 dead before its use. This patch checks the liveness of each subreg to make sure the reg is actually dead. llvm-svn: 204411	2014-03-20 23:28:16 +00:00
Hao Liu	40b5ab8e5b	[ARM]Fix an assertion failure in A15SDOptimizer about DPair reg class by treating DPair as QPR. llvm-svn: 204304	2014-03-20 05:36:59 +00:00
Rafael Espindola	2fb5bc33a3	Remove the linker_private and linker_private_weak linkages. These linkages were introduced some time ago, but it was never very clear what exactly their semantics were or what they should be used for. Some investigation found these uses: * utf-16 strings in clang. * non-unnamed_addr strings produced by the sanitizers. It turns out they were just working around a more fundamental problem. For some sections a MachO linker needs a symbol in order to split the section into atoms, and llvm had no idea that was the case. I fixed that in r201700 and it is now safe to use the private linkage. When the object ends up in a section that requires symbols, llvm will use a 'l' prefix instead of a 'L' prefix and things just work. With that, these linkages were already dead, but there was a potential future user in the objc metadata information. I am still looking at CGObjcMac.cpp, but at this point I am convinced that linker_private and linker_private_weak are not what they need. The objc uses are currently split in * Regular symbols (no '\01' prefix). LLVM already directly provides whatever semantics they need. * Uses of a private name (start with "\01L" or "\01l") and private linkage. We can drop the "\01L" and "\01l" prefixes as soon as llvm agrees with clang on L being ok or not for a given section. I have two patches in code review for this. * Uses of private name and weak linkage. The last case is the one that one could think would fit one of these linkages. That is not the case. The semantics are * the linker will merge these symbol by name. * the linker will hide them in the final DSO. Given that the merging is done by name, any of the private (or internal) linkages would be a bad match. They allow llvm to rename the symbols, and that is really not what we want. From the llvm point of view, these objects should really be (linkonce\|weak)(_odr)?. For now, just keeping the "\01l" prefix is probably the best for these symbols. If we one day want to have a more direct support in llvm, IMHO what we should add is not a linkage, it is just a hidden_symbol attribute. It would be applicable to multiple linkages. For example, on weak it would produce the current behavior we have for objc metadata. On internal, it would be equivalent to private (and we should then remove private). llvm-svn: 203866	2014-03-13 23:18:37 +00:00
Mark Seaborn	3f73533a9d	Cleanup: Remove use of old "-enable-correct-eh-support" option from a test This option enables LowerInvoke's obsolete SJLJ EH support, but the target used in this test (ARM Darwin) no longer uses the LowerInvoke pass, so the option has no effect here. This target currently uses the newer SjLjEHPrepare pass instead. This cleanup will help with removing "-enable-correct-eh-support". Differential Revision: http://llvm-reviews.chandlerc.com/D3064 llvm-svn: 203810	2014-03-13 16:23:00 +00:00
Hans Wennborg	89050436e6	[ARM] Use symbolic register names in .cfi directives only with IAS (PR19110) This is a follow-up to r203635. Saleem pointed out that since symbolic register names are much easier to read, it would be good if we could turn them off only when we really need to because we're using an external assembler. Differential Revision: http://llvm-reviews.chandlerc.com/D3056 llvm-svn: 203806	2014-03-13 15:56:41 +00:00
Tim Northover	3cccc45a9f	ARM: correct Dwarf output for non-contiguous VFP saves. When the list of VFP registers to be saved was non-contiguous (so multiple vpush/vpop instructions were needed) these were being ordered oddly, as in: vpush {d8, d9} vpush {d11} This led to the layout in memory being [d11, d8, d9] which is ugly and doesn't match the CFI_INSTRUCTIONs we're generating either (so Dwarf info would be broken). This switches the order of vpush/vpop (in both prologue and epilogue, obviously) so that the Dwarf locations are correct again. rdar://problem/16264856 llvm-svn: 203655	2014-03-12 11:29:23 +00:00
Hans Wennborg	14863418ed	[ARM] Use DWARF register numbers for CFI directives in ELF assembly It seems gas can't handle CFI directives with VFP register names ("d12", etc.). This broke us trying to build Chromium for Android after 201423. A gas bug has been filed: https://sourceware.org/bugzilla/show_bug.cgi?id=16694 compnerd suggested making this conditional on whether we're using the integrated assembler or not. I'll look into that in a follow-up patch. Differential Revision: http://llvm-reviews.chandlerc.com/D3049 llvm-svn: 203635	2014-03-12 03:52:34 +00:00
Saleem Abdulrasool	0d96f3dd6e	ARM: honour -f{no-,}optimize-sibling-calls Use the options in the ARMISelLowering to control whether tail calls are optimised or not. Previously, this option was entirely ignored on the ARM target and only honoured on x86. This option is mostly useful in profiling scenarios. The default remains that tail call optimisations will be applied. llvm-svn: 203577	2014-03-11 15:09:54 +00:00
Saleem Abdulrasool	b720a6bab7	ARM: remove ancient -arm-tail-calls option This option is from 2010, designed to work around a linker issue on Darwin for ARM. According to grosbach this is no longer an issue and this option can safely be removed. llvm-svn: 203576	2014-03-11 15:09:49 +00:00
Saleem Abdulrasool	ec1ec1b416	ARM: enable tail call optimisation on Thumb 2 Tail call optimisation was previously disabled on all targets other than iOS5.0+. This enables the tail call optimisation on all Thumb 2 capable platforms. The test adjustments are to remove the IR hint "tail" to function invocation. The tests were designed assuming that tail call optimisations would not kick in which no longer holds true. llvm-svn: 203575	2014-03-11 15:09:44 +00:00
Tim Northover	e94a518a22	IR: add a second ordering operand to cmpxhg for failure The syntax for "cmpxchg" should now look something like: cmpxchg i32* %addr, i32 42, i32 3 acquire monotonic where the second ordering argument gives the required semantics in the case that no exchange takes place. It should be no stronger than the first ordering constraint and cannot be either "release" or "acq_rel" (since no store will have taken place). rdar://problem/15996804 llvm-svn: 203559	2014-03-11 10:48:52 +00:00
Matt Arsenault	532db69984	Fix undefined behavior in vector shift tests. These were all shifting the same amount as the bitwidth. llvm-svn: 203519	2014-03-11 00:01:41 +00:00
Rafael Espindola	b1f25f1b93	Replace PROLOG_LABEL with a new CFI_INSTRUCTION. The old system was fairly convoluted: * A temporary label was created. * A single PROLOG_LABEL was created with it. * A few MCCFIInstructions were created with the same label. The semantics were that the cfi instructions were mapped to the PROLOG_LABEL via the temporary label. The output position was that of the PROLOG_LABEL. The temporary label itself was used only for doing the mapping. The new CFI_INSTRUCTION has a 1:1 mapping to MCCFIInstructions and points to one by holding an index into the CFI instructions of this function. I did consider removing MMI.getFrameInstructions completelly and having CFI_INSTRUCTION own a MCCFIInstruction, but MCCFIInstructions have non trivial constructors and destructors and are somewhat big, so the this setup is probably better. The net result is that we don't create temporary labels that are never used. llvm-svn: 203204	2014-03-07 06:08:31 +00:00
Oliver Stannard	d55e115b58	ARM: Correctly align arguments after a byval struct is passed on the stack llvm-svn: 202985	2014-03-05 15:25:27 +00:00
Adrian Prantl	7072073cc9	Debug info: Remove ARMAsmPrinter::EmitDwarfRegOp(). AsmPrinter can now scan the register file for sub- and super-registers. No functionality change intended. (Tests are updated because the comments in the assembler output are different.) llvm-svn: 202416	2014-02-27 17:56:08 +00:00
Daniel Sanders	91efd1c464	Stop test/CodeGen/ARM/a15.ll targetting non-ARM targets. Summary: Fixes an issue where a test attempts to use -mcpu=cortex-a15 on non-ARM targets. This triggers an assertion on MIPS since it doesn't know what ABI to use by default for unrecognized processors. Reviewers: rengolin Reviewed By: rengolin CC: llvm-commits, aemerson, rengolin Differential Revision: http://llvm-reviews.chandlerc.com/D2876 llvm-svn: 202256	2014-02-26 11:26:18 +00:00
Logan Chien	18583d71e8	Keep the link register for uwtable. The function with uwtable attribute might be visited by the stack unwinder, thus the link register should be considered as clobbered after the execution of the branch and link instruction (i.e. the definition of the machine instruction can't be ignored) even when the callee function are marked with noreturn. llvm-svn: 202165	2014-02-25 16:57:28 +00:00
Mark Seaborn	be266aa325	Use 16 byte stack alignment for NaCl on ARM NaCl's ARM ABI uses 16 byte stack alignment, so set that in ARMSubtarget.cpp. Using 16 byte alignment exposes an issue in code generation in which a varargs function leaves a 4 byte gap between the values of r1-r3 saved to the stack and the following arguments that were passed on the stack. (Previously, this code only needed to support 4 byte and 8 byte alignment.) With this issue, llc generated: varargs_func: sub sp, sp, #16 push {lr} sub sp, sp, #12 add r0, sp, #16 // Should be 20 stm r0, {r1, r2, r3} ldr r0, .LCPI0_0 // Address of va_list add r1, sp, #16 str r1, [r0] bl external_func Fix the bug by checking for "Align > 4". Also simplify the code by using OffsetToAlignment(), and update comments. Differential Revision: http://llvm-reviews.chandlerc.com/D2677 llvm-svn: 201497	2014-02-16 18:59:48 +00:00
Nico Rieck	a0abeb3548	Fix more broken CHECK lines llvm-svn: 201493	2014-02-16 13:28:39 +00:00
Nico Rieck	5ba5226ab9	Add extra CHECK prefix to tests with explicit prefix These tests mistakenly assume that CHECK is still available even if an explicit prefix is specified. llvm-svn: 201492	2014-02-16 13:28:15 +00:00
Nico Rieck	7647178738	Fix broken CHECK lines llvm-svn: 201479	2014-02-16 07:31:05 +00:00
Artyom Skrobov	f6830f47b8	Generate the DWARF stack frame decode operations in the function prologue for ARM/Thumb functions. Patch by Keith Walker! llvm-svn: 201423	2014-02-14 17:19:07 +00:00
Rafael Espindola	8459762c88	Add triples to try to fix the windows bots. llvm-svn: 201345	2014-02-13 16:49:47 +00:00
Daniel Sanders	753e17629d	Re-commit: Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call Summary: AsmPrinter::EmitInlineAsm() will no longer use the EmitRawText() call for targets with mature MC support. Such targets will always parse the inline assembly (even when emitting assembly). Targets without mature MC support continue to use EmitRawText() for assembly output. The hasRawTextSupport() check in AsmPrinter::EmitInlineAsm() has been replaced with MCAsmInfo::UseIntegratedAs which when true, causes the integrated assembler to parse inline assembly (even when emitting assembly output). UseIntegratedAs is set to true for targets that consider any failure to parse valid assembly to be a bug. Target specific subclasses generally enable the integrated assembler in their constructor. The default value can be overridden with -no-integrated-as. All tests that rely on inline assembly supporting invalid assembly (for example, those that use mnemonics such as 'foo' or 'hello world') have been updated to disable the integrated assembler. Changes since review (and last commit attempt): - Fixed test failures that were missed due to configuration of local build. (fixes crash.ll and a couple others). - Fixed tests that happened to pass because the local build was on X86 (should fix 2007-12-17-InvokeAsm.ll) - mature-mc-support.ll's should no longer require all targets to be compiled. (should fix ARM and PPC buildbots) - Object output (-filetype=obj and similar) now forces the integrated assembler to be enabled regardless of default setting or -no-integrated-as. (should fix SystemZ buildbots) Reviewers: rafael Reviewed By: rafael CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2686 llvm-svn: 201333	2014-02-13 14:44:26 +00:00
Tim Northover	914af6273b	ARM: remove floating-point patterns for @llvm.arm.neon.vabs The front-end is now generating the generic @llvm.fabs for this operation now, so the extra patterns are no longer needed. llvm-svn: 201314	2014-02-13 10:44:30 +00:00
Akira Hatanaka	a07ffb5b31	Pass edges weights to MachineBasicBlock::addSuccessor in TailDuplicatePass to preserve branch probability information. <rdar://problem/15893208> llvm-svn: 201245	2014-02-12 18:09:18 +00:00
Daniel Sanders	abe212a3b8	Revert r201237+r201238: Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call It introduced multiple test failures in the buildbots. llvm-svn: 201241	2014-02-12 15:39:20 +00:00
Daniel Sanders	a7d504cf58	Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call Summary: AsmPrinter::EmitInlineAsm() will no longer use the EmitRawText() call for targets with mature MC support. Such targets will always parse the inline assembly (even when emitting assembly). Targets without mature MC support continue to use EmitRawText() for assembly output. The hasRawTextSupport() check in AsmPrinter::EmitInlineAsm() has been replaced with MCAsmInfo::UseIntegratedAs which when true, causes the integrated assembler to parse inline assembly (even when emitting assembly output). UseIntegratedAs is set to true for targets that consider any failure to parse valid assembly to be a bug. Target specific subclasses generally enable the integrated assembler in their constructor. The default value can be overridden with -no-integrated-as. All tests that rely on inline assembly supporting invalid assembly (for example, those that use mnemonics such as 'foo' or 'hello world') have been updated to disable the integrated assembler. Reviewers: rafael Reviewed By: rafael CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2686 llvm-svn: 201237	2014-02-12 14:44:54 +00:00
Evan Cheng	57add3e4ee	Tweak ARM fastcc by adopting these two AAPCS rules: * CPRCs may be allocated to co-processor registers or the stack – they may never be allocated to core registers * When a CPRC is allocated to the stack, all other VFP registers should be marked as unavailable The difference is only noticeable in rare cases where there are a large number of floating point arguments (e.g. 7 doubles + additional float, double arguments). Although it's probably still better to avoid vmov as it can cause stalls in some older ARM cores. The other, more subtle benefit, is to minimize difference between the various calling conventions. rdar://16039676 llvm-svn: 201193	2014-02-11 23:49:31 +00:00
Tim Northover	b0430415e6	ARM: use natural LLVM IR for vshll instructions Similarly to the vshrn instructions, these are simple zext/sext + trunc operations. Using normal LLVM IR should allow for better code, and more sharing with the AArch64 backend. llvm-svn: 201093	2014-02-10 16:20:29 +00:00
Oliver Stannard	8dcaa761a2	ARM: r12 is callee-saved for interrupt handlers For A- and R-class processors, r12 is not normally callee-saved, but is for interrupt handlers. See AAPCS, 5.3.1.1, "Use of IP by the linker". llvm-svn: 201089	2014-02-10 14:24:23 +00:00
Tim Northover	170daafe01	ARM: use LLVM IR to represent the vshrn operation vshrn is just the combination of a right shift and a truncate (and the limits on the immediate value actually mean the signedness of the shift doesn't matter). Using that representation allows us to get rid of an ARM-specific intrinsic, share more code with AArch64 and hopefully get better code out of the mid-end optimisers. llvm-svn: 201085	2014-02-10 14:04:07 +00:00
Renato Golin	3dc5ade8bb	Fix Darwin bots from EHABI change llvm-svn: 200990	2014-02-07 20:32:32 +00:00
Renato Golin	78a6eba862	Remove -arm-disable-ehabi option llvm-svn: 200988	2014-02-07 20:12:49 +00:00
Oliver Stannard	1dc1034218	LLVM-1163: AAPCS-VFP violation when CPRC allocated to stack According to the AAPCS, when a CPRC is allocated to the stack, all other VFP registers should be marked as unavailable. I have also modified the rules for allocating non-CPRCs to the stack, to make it more explicit that all GPRs must be made unavailable. I cannot think of a case where the old version would produce incorrect answers, so there is no test for this. llvm-svn: 200970	2014-02-07 11:19:53 +00:00
Manman Ren	37c9267107	PGO branch weight: fix PR18752. Fix a bug triggered in IfConverterTriangle when CvtBB has multiple predecessors by getting the weights before removing a successor. llvm-svn: 200958	2014-02-07 00:38:56 +00:00
David Peixotto	b9b7362cdc	Fix PR18345: ldr= pseudo instruction produces incorrect code when using in inline assembly This patch fixes the ldr-pseudo implementation to work when used in inline assembly. The fix is to move arm assembler constant pools from the ARMAsmParser class to the ARMTargetStreamer class. Previously we kept the assembler generated constant pools in the ARMAsmParser object. This does not work for inline assembly because a new parser object is created for each blob of inline assembly. This patch moves the constant pools to the ARMTargetStreamer class so that the constant pool will remain alive for the entire code generation process. An ARMTargetStreamer class is now required for the arm backend. There was no existing implementation for MachO, only Asm and ELF. Instead of creating an empty MachO subclass, we decided to make the ARMTargetStreamer a non-abstract class and provide default (llvm_unreachable) implementations for the non constant-pool related methods. Differential Revision: http://llvm-reviews.chandlerc.com/D2638 llvm-svn: 200777	2014-02-04 17:22:40 +00:00
Tim Northover	fdbdb4b6d5	ARM & AArch64: merge NEON absolute compare intrinsics There was an extremely confusing proliferation of LLVM intrinsics to implement the vacge & vacgt instructions. This combines them all into two polymorphic intrinsics, shared across both backends. llvm-svn: 200768	2014-02-04 14:55:42 +00:00
Tim Northover	e42fb07618	ARM: fix fast-isel assertion failure Missing braces on if meant we inserted both ARM and Thumb load for a litpool entry. This didn't end well. rdar://problem/15959157 llvm-svn: 200752	2014-02-04 10:38:46 +00:00
David Blaikie	5e390e4df7	DebugInfo: Remove some unneeded conditionals now that DIBuilder no longer emits zero-length arrays as {i32 0} A bunch of test cases needed to be cleaned up for this, many my fault - when implementid imported modules I updated test cases by simply duplicating the prior metadata field - which wasn't always the empty metadata entry. llvm-svn: 200731	2014-02-04 01:23:52 +00:00
Tim Northover	24979d8e10	AArch64 & ARM: refactor crypto intrinsics to take scalars Some of the SHA instructions take a scalar i32 as one argument (largely because they work on 160-bit hash fragments). This wasn't reflected in the IR previously, with ARM and AArch64 choosing different types (<4 x i32> and <1 x i32> respectively) which was ugly. This makes all the affected intrinsics take a uniform "i32", allowing them to become non-polymorphic at the same time. llvm-svn: 200706	2014-02-03 17:27:49 +00:00
Josh Magee	24c7f06333	[stackprotector] Implement the sspstrong rules for stack layout. This changes the PrologueEpilogInserter and LocalStackSlotAllocation passes to follow the extended stack layout rules for sspstrong and sspreq. The sspstrong layout rules are: 1. Large arrays and structures containing large arrays (>= ssp-buffer-size) are closest to the stack protector. 2. Small arrays and structures containing small arrays (< ssp-buffer-size) are 2nd closest to the protector. 3. Variables that have had their address taken are 3rd closest to the protector. Differential Revision: http://llvm-reviews.chandlerc.com/D2546 llvm-svn: 200601	2014-02-01 01:36:16 +00:00
Manman Ren	4ece7452ba	PGO branch weight: update edge weights in SelectionDAGBuilder. When converting from "or + br" to two branches, or converting from "and + br" to two branches, we correctly update the edge weights of the two branches. The previous attempt at r200431 was reverted at r200434 because of two testing case failures. I modified my patch a little, but forgot to re-run "make check-all". Testing case CodeGen/ARM/lsr-unfolded-offset.ll is updated because of the patch's impact on branch probability which causes changes in spill placement. llvm-svn: 200502	2014-01-31 00:42:44 +00:00
Evgeniy Stepanov	02bc78b9cd	Reenable ARM EHABI on Android. Broken in r200388. llvm-svn: 200466	2014-01-30 14:18:25 +00:00
Manman Ren	b681918ddd	PGO branch weight: update edge weights in IfConverter. This commit only handles IfConvertTriangle. To update edge weights of a successor, one interface is added to MachineBasicBlock: /// Set successor weight of a given iterator. setSuccWeight(succ_iterator I, uint32_t weight) An existing testing case test/CodeGen/Thumb2/v8_IT_5.ll is updated, since we now correctly update the edge weights, the cold block is placed at the end of the function and we jump to the cold block. llvm-svn: 200428	2014-01-29 23:18:47 +00:00
Renato Golin	8cea6e8fc6	Enable EHABI by default After all hard work to implement the EHABI and with the test-suite passing, it's time to turn it on by default and allow users to disable it as a work-around while we fix the eventual bugs that show up. This commit also remove the -arm-enable-ehabi-descriptors, since we want the tables to be printed every time the EHABI is turned on for non-Darwin ARM targets. Although MCJIT EHABI is not working yet (needs linking with the right libraries), this commit also fixes some relocations on MCJIT regarding the EH tables/lib calls, and update some tests to avoid using EH tables when none are needed. The EH tests in the test-suite that were previously disabled on ARM now pass with these changes, so a follow-up commit on the test-suite will re-enable them. llvm-svn: 200388	2014-01-29 11:50:56 +00:00
David Woodhouse	21bfc71752	[ARM] Remove superfluous inline asm mode switch test llvm-svn: 200361	2014-01-29 00:49:28 +00:00
David Woodhouse	7db3705f9e	Tests for mode switching 1. test that inlineasm works 2. test that relaxable instructions are re-encoded in the correct mode. llvm-svn: 200351	2014-01-28 23:13:30 +00:00
David Peixotto	b76f55f74a	Fix unsupported addressing mode assertion for pld Summary: This commit gives an address mode to the PLD instruction. We were getting an assertion failure in the frame lowering code because we had code that was doing a pld of a stack allocated address. The frame lowering was checking the address mode and then asserting because pld had none defined. This commit fixes pld for arm mode. There was a previous fix for thumb mode in a separate commit. The commit for thumb mode added a test in a separate file because it would otherwise fail for arm. This commit moves the thumb test back into the prefetch.ll file and adds the corresponding arm test. Differential Revision: http://llvm-reviews.chandlerc.com/D2622 llvm-svn: 200248	2014-01-27 21:39:04 +00:00
Juergen Ributzka	f26beda7c7	Revert "Revert "Add Constant Hoisting Pass" (r200034)" This reverts commit r200058 and adds the using directive for ARMTargetTransformInfo to silence two g++ overload warnings. llvm-svn: 200062	2014-01-25 02:02:55 +00:00
Hans Wennborg	4d67a2e85a	Revert "Add Constant Hoisting Pass" (r200034) This commit caused -Woverloaded-virtual warnings. The two new TargetTransformInfo::getIntImmCost functions were only added to the superclass, and to the X86 subclass. The other targets were not updated, and the warning highlighted this by pointing out that e.g. ARMTTI::getIntImmCost was hiding the two new getIntImmCost variants. We could pacify the warning by adding "using TargetTransformInfo::getIntImmCost" to the various subclasses, or turning it off, but I suspect that it's wrong to leave the functions unimplemnted in those targets. The default implementations return TCC_Free, which I don't think is right e.g. for ARM. llvm-svn: 200058	2014-01-25 01:18:18 +00:00
Juergen Ributzka	4f3df4ad64	Add Constant Hoisting Pass Retry commit r200022 with a fix for the build bot errors. Constant expressions have (unlike instructions) module scope use lists and therefore may have users in different functions. The fix is to simply ignore these out-of-function uses. llvm-svn: 200034	2014-01-24 20:18:00 +00:00
Juergen Ributzka	50e7e80d00	Revert "Add Constant Hoisting Pass" This reverts commit r200022 to unbreak the build bots. llvm-svn: 200024	2014-01-24 18:40:30 +00:00
Juergen Ributzka	38b67d0caf	Add Constant Hoisting Pass This pass identifies expensive constants to hoist and coalesces them to better prepare it for SelectionDAG-based code generation. This works around the limitations of the basic-block-at-a-time approach. First it scans all instructions for integer constants and calculates its cost. If the constant can be folded into the instruction (the cost is TCC_Free) or the cost is just a simple operation (TCC_BASIC), then we don't consider it expensive and leave it alone. This is the default behavior and the default implementation of getIntImmCost will always return TCC_Free. If the cost is more than TCC_BASIC, then the integer constant can't be folded into the instruction and it might be beneficial to hoist the constant. Similar constants are coalesced to reduce register pressure and materialization code. When a constant is hoisted, it is also hidden behind a bitcast to force it to be live-out of the basic block. Otherwise the constant would be just duplicated and each basic block would have its own copy in the SelectionDAG. The SelectionDAG recognizes such constants as opaque and doesn't perform certain transformations on them, which would create a new expensive constant. This optimization is only applied to integer constants in instructions and simple (this means not nested) constant cast experessions. For example: %0 = load i64* inttoptr (i64 big_constant to i64*) Reviewed by Eric llvm-svn: 200022	2014-01-24 18:23:08 +00:00
Alp Toker	cb40291100	Fix known typos Sweep the codebase for common typos. Includes some changes to visible function names that were misspelt. llvm-svn: 200018	2014-01-24 17:20:08 +00:00
Rafael Espindola	f8f15bf670	Don't use "llc -filetype=obj" now that the codepath is the same. r200011 remove the special codepaths in MC for inline asm, so we can now test all the logic with just llc + llvm-mc. llvm-svn: 200013	2014-01-24 15:59:50 +00:00
Weiming Zhao	5930ae6cc2	[Thumbv8] Fix the value of BLXOperandIndex of isV8EligibleForIT Originally, BLX was passed as operand #0 in MachineInstr and as operand #2 in MCInst. But now, it's operand #2 in both cases. This patch also removes unnecessary FileCheck in the test case added by r199127. llvm-svn: 199928	2014-01-23 19:55:33 +00:00
Tim Northover	55c625f222	ARM: use litpools for normal i32 imms when compiling minsize. With constant-sharing, litpool loads consume 4 + N2 bytes of code, but movw/movt pairs consume 8N. This means litpools are better than movw/movt even with just one use. Other materialisation strategies can still be better though, so the logic is a little odd. llvm-svn: 199891	2014-01-23 13:43:47 +00:00
Greg Fitzgerald	1f6a6086ae	Fix inline assembly that switches between ARM and Thumb modes This patch restores the ARM mode if the user's inline assembly does not. In the object streamer, it ensures that instructions following the inline assembly are encoded correctly and that correct mapping symbols are emitted. For the asm streamer, it emits a .arm or .thumb directive. This patch does not ensure that the inline assembly contains the ADR instruction to switch modes at runtime. The problem we need to solve is code like this: int foo(int a, int b) { int r = a + b; asm volatile( ".align 2 \n" ".arm \n" "add r0,r0,r0 \n" : : "r"(r)); return r+1; } If we compile this function in thumb mode then the inline assembly will switch to arm mode. We need to make sure that we switch back to thumb mode after emitting the inline assembly or we will incorrectly encode the instructions that follow (i.e. the assembly instructions for return r+1). Based on patch by David Peixotto Change-Id: Ib57f6d2d78a22afad5de8693fba6230ff56ba48b llvm-svn: 199818	2014-01-22 18:32:35 +00:00
James Molloy	43ccae1bb4	Remove the useless pseudo instructions VDUPfdf and VDUPfqf, replacing them with patterns to match VDUPLN. llvm-svn: 199675	2014-01-20 17:14:48 +00:00
Artyom Skrobov	10e76a4eda	[ARM] Do not generate Tag_DIV_use=AllowDIVExt when hardware div is non-optional: it should have the default value of AllowDIVIfExists llvm-svn: 199638	2014-01-20 10:18:42 +00:00
Saleem Abdulrasool	196c3212ba	ARM: update build attributes for ABI r2.09 Update names for the names as per the current ABI errata. Mark deprecated tags as such. llvm-svn: 199576	2014-01-19 08:25:35 +00:00
Amara Emerson	dba59eb3f4	Move the xscale build attribute test to the proper place and remove the old one. The encoding of build attributes is already tested in CodeGen/ARM/build-attributes-encoding.s llvm-svn: 199393	2014-01-16 15:11:54 +00:00
Jiangning Liu	4df2363a23	For ARM, fix assertuib failures for some ld/st 3/4 instruction with wirteback. llvm-svn: 199369	2014-01-16 09:16:13 +00:00
Weiming Zhao	fe26fd27b4	PR 18466: Fix ARM Pseudo Expansion When expanding neon pseudo stores, it may miss the implicit uses of sub regs, which may cause post RA scheduler reorder instructions that breakes anti dependency. For example: VST1d64QPseudo %R0<kill>, 16, %Q9_Q10, pred:14, pred:%noreg will be expanded to VST1d64Q %R0<kill>, 16, %D18, pred:14, pred:%noreg; An instruction that defines %D20 may be scheduled before the store by mistake. This patches adds implicit uses for such case. For the example above, it emits: VST1d64Q %R0<kill>, 8, %D18, pred:14, pred:%noreg, %Q9_Q10<imp-use> llvm-svn: 199282	2014-01-15 01:32:12 +00:00
Tim Northover	463a5f24d1	ARM: correctly determine final tBX_LR in Thumb1 functions The changes caused by folding an sp-adjustment into a "pop" previously disrupted the forward search for the final real instruction in a terminating block. This switches to a backward search (skipping debug instrs). This fixes PR18399. Patch by Zhaoshi. llvm-svn: 199266	2014-01-14 22:53:28 +00:00
Tim Northover	56cc5c92db	ARM: add constraint that RdLo != Rn != RdHi for v5 MLA insts. llvm-svn: 199212	2014-01-14 13:05:47 +00:00
Tim Northover	7d074a5ad6	ARM: add test for r199108. Oops. rdar://problem/15800156 llvm-svn: 199109	2014-01-13 14:20:25 +00:00
Benjamin Kramer	c10563d14e	Fix broken CHECK lines. llvm-svn: 199016	2014-01-11 21:06:00 +00:00

1 2 3 4 5 ...

2042 Commits