llvm-project

Commit Graph

Author	SHA1	Message	Date
Ana Pazos	93a07c2185	Added support for mcpu krait - krait processor currently modeled with the same features as A9. - Krait processor additionally has VFP4 (fused multiply add/sub) and hardware division features enabled. - krait has currently the same Schedule model as A9 - krait cpu flag is not recognized by the GNU assembler yet, it is replaced with march=armv7-a to avoid a lower march from being used. llvm-svn: 196619	2013-12-06 22:48:17 +00:00
Weiming Zhao	43d8e6cb3b	Bug 18149: [AArch32] VSEL instructions have no ARMCC field The current peephole optimization for compare instructions assumes that an instr that uses CPSR has an MO for the ARM Cond code. However, for VSEL instructions (vseleq, vselge, vselgt, vselvs), there is no such operand, nor do they support the modification of Cond Code. llvm-svn: 196588	2013-12-06 17:56:48 +00:00
Andrew Trick	880e573d98	MI-Sched: handle latency of in-order operations with the new machine model. The per-operand machine model allows the target to define "unbuffered" processor resources. This change is a quick, cheap way to model stalls caused by the latency of operations that use such resources. This only applies when the processor's micro-op buffer size is non-zero (Out-of-Order). We can't precisely model in-order stalls during out-of-order execution, but this is an easy and effective heuristic. It benefits cortex-a9 scheduling when using the new machine model, which is not yet on by default. MI-Sched for armv7 was evaluated on Swift (and only not enabled because of a performance bug related to predication). However, we never evaluated Cortex-A9 performance on MI-Sched in its current form. This change adds MI-Sched functionality to reach performance goals on A9. The only remaining change is to allow MI-Sched to run as a PostRA pass. I evaluated performance using a set of options to estimate the performance impact once MI sched is default on armv7: -mcpu=cortex-a9 -disable-post-ra -misched-bench -scheditins=false For a simple saxpy loop I see a 1.7x speedup. 
Here are the llvm-testsuite results: (min run time over 2 runs, filtering tiny changes) Speedups: \| Benchmarks/BenchmarkGame/recursive \| 52.39% \| \| Benchmarks/VersaBench/beamformer \| 20.80% \| \| Benchmarks/Misc/pi \| 19.97% \| \| Benchmarks/Misc/mandel-2 \| 19.95% \| \| SPEC/CFP2000/188.ammp \| 18.72% \| \| Benchmarks/McCat/08-main/main \| 18.58% \| \| Benchmarks/Misc-C++/Large/sphereflake \| 18.46% \| \| Benchmarks/Olden/power \| 17.11% \| \| Benchmarks/Misc-C++/mandel-text \| 16.47% \| \| Benchmarks/Misc/oourafft \| 15.94% \| \| Benchmarks/Misc/flops-7 \| 14.99% \| \| Benchmarks/FreeBench/distray \| 14.26% \| \| SPEC/CFP2006/470.lbm \| 14.00% \| \| mediabench/mpeg2/mpeg2dec/mpeg2decode \| 12.28% \| \| Benchmarks/SmallPT/smallpt \| 10.36% \| \| Benchmarks/Misc-C++/Large/ray \| 8.97% \| \| Benchmarks/Misc/fp-convert \| 8.75% \| \| Benchmarks/Olden/perimeter \| 7.10% \| \| Benchmarks/Bullet/bullet \| 7.03% \| \| Benchmarks/Misc/mandel \| 6.75% \| \| Benchmarks/Olden/voronoi \| 6.26% \| \| Benchmarks/Misc/flops-8 \| 5.77% \| \| Benchmarks/Misc/matmul_f64_4x4 \| 5.19% \| \| Benchmarks/MiBench/security-rijndael \| 5.15% \| \| Benchmarks/Misc/flops-6 \| 5.10% \| \| Benchmarks/Olden/tsp \| 4.46% \| \| Benchmarks/MiBench/consumer-lame \| 4.28% \| \| Benchmarks/Misc/flops-5 \| 4.27% \| \| Benchmarks/mafft/pairlocalalign \| 4.19% \| \| Benchmarks/Misc/himenobmtxpa \| 4.07% \| \| Benchmarks/Misc/lowercase \| 4.06% \| \| SPEC/CFP2006/433.milc \| 3.99% \| \| Benchmarks/tramp3d-v4 \| 3.79% \| \| Benchmarks/FreeBench/pifft \| 3.66% \| \| Benchmarks/Ptrdist/ks \| 3.21% \| \| Benchmarks/Adobe-C++/loop_unroll \| 3.12% \| \| SPEC/CINT2000/175.vpr \| 3.12% \| \| Benchmarks/nbench \| 2.98% \| \| SPEC/CFP2000/183.equake \| 2.91% \| \| Benchmarks/Misc/perlin \| 2.85% \| \| Benchmarks/Misc/flops-1 \| 2.82% \| \| Benchmarks/Misc-C++-EH/spirit \| 2.80% \| \| Benchmarks/Misc/flops-2 \| 2.77% \| \| Benchmarks/NPB-serial/is \| 2.42% \| \| Benchmarks/ASC_Sequoia/CrystalMk \| 2.33% \| \| 
Benchmarks/BenchmarkGame/n-body \| 2.28% \| \| Benchmarks/SciMark2-C/scimark2 \| 2.27% \| \| Benchmarks/Olden/bh \| 2.03% \| \| skidmarks10/skidmarks \| 1.81% \| \| Benchmarks/Misc/flops \| 1.72% \| Slowdowns: \| Benchmarks/llubenchmark/llu \| -14.14% \| \| Benchmarks/Polybench/stencils/seidel-2d \| -5.67% \| \| Benchmarks/Adobe-C++/functionobjects \| -5.25% \| \| Benchmarks/Misc-C++/oopack_v1p8 \| -5.00% \| \| Benchmarks/Shootout/hash \| -2.35% \| \| Benchmarks/Prolangs-C++/ocean \| -2.01% \| \| Benchmarks/Polybench/medley/floyd-warshall \| -1.98% \| \| Polybench/linear-algebra/kernels/3mm \| -1.95% \| \| Benchmarks/McCat/09-vor/vor \| -1.68% \| llvm-svn: 196516	2013-12-05 17:55:58 +00:00
Andrew Trick	ff199a4b8e	Fix the A9 machine model. VTRN writes two registers. llvm-svn: 196514	2013-12-05 17:55:49 +00:00
Tim Northover	e4def5e228	ARM: fix yet another stack-folding bug We were trying to fold the stack adjustment into the wrong instruction in the situation where the entire basic-block was epilogue code. Really, it can only ever be valid to do the folding precisely where the "add sp, ..." would be placed so there's no need for a separate iterator to track that. Should fix PR18136. llvm-svn: 196493	2013-12-05 11:02:02 +00:00
Alp Toker	f907b891da	Correct word hyphenations This patch tries to avoid unrelated changes other than fixing a few hyphen-related ambiguities and contractions in nearby lines. llvm-svn: 196471	2013-12-05 05:44:44 +00:00
David Peixotto	8ad70b3542	Add support for parsing ARM symbol variants on ELF targets ARM symbol variants are written with parens instead of @ like this: .word __GLOBAL_I_a(target1) This commit adds support for parsing these symbol variants in expressions. We introduce a new flag to MCAsmInfo that indicates the parser should use parens to parse the symbol variant. The expression parser is modified to look for symbol variants using parens instead of @ when the corresponding MCAsmInfo flag is true. The MCAsmInfo parens flag is enabled only for ARM on ELF. By adding this flag to MCAsmInfo, we are able to get rid of redundant ARM-specific symbol variants and use the generic variants instead (e.g. VK_GOT instead of VK_ARM_GOT). We use the new UseParensForSymbolVariant attribute in MCAsmInfo to correctly print the symbol variants for arm. To achieve this we need to keep a handle to the MCAsmInfo in the MCSymbolRefExpr class that we can check when printing the symbol variant. Updated Tests: Changed case of symbol variant to match the generic kind. test/CodeGen/ARM/tls-models.ll test/CodeGen/ARM/tls1.ll test/CodeGen/ARM/tls2.ll test/CodeGen/Thumb2/tls1.ll test/CodeGen/Thumb2/tls2.ll PR18080 llvm-svn: 196424	2013-12-04 22:43:20 +00:00
Chad Rosier	1d22b5d1c0	Update the UseFusedMAC definition to directly specify its dependence on having VFP4. Patch by Daniel Stewart! llvm-svn: 196390	2013-12-04 17:16:36 +00:00
James Molloy	8a25992f39	Addrspacecasts are no-ops on ARM. Testcase added. llvm-svn: 196269	2013-12-03 11:23:11 +00:00
Rafael Espindola	5113d166f5	Refactor the setting of PrivateGlobalPrefix. No functionality change. llvm-svn: 196170	2013-12-02 23:39:26 +00:00
Rafael Espindola	f4e6b29a03	Move getSymbolWithGlobalValueBase to TargetLoweringObjectFile. This allows it to be used in TargetLoweringObjectFileImpl.cpp. llvm-svn: 196117	2013-12-02 16:25:47 +00:00
Rafael Espindola	957cf6f9e1	Remove dead code. MO_JumpTableIndex and MO_ExternalSymbol don't show up on inline asm. Keeping parts of the old asm printer just to print inline asm to a string that we then parse back looks like a hack. llvm-svn: 196111	2013-12-02 15:36:37 +00:00
Tim Northover	dee8604caf	ARM: decide whether to use movw/movt based on "minsize" attribute. llvm-svn: 196102	2013-12-02 14:46:26 +00:00
Tim Northover	72360d201c	ARM: add pseudo-instructions for lit-pool global materialisation These are used by MachO only at the moment, and (much like the existing MOVW/MOVT set) work around the fact that the labels used in the actual instructions often contain PC-dependent components, which means that repeatedly materialising the same global can't be CSEed. With small modifications, it could be adapted to how ELF finds the address of _GLOBAL_OFFSET_TABLE_, which would give similar benefits in PIC mode there. llvm-svn: 196090	2013-12-02 10:35:41 +00:00
Rafael Espindola	50712a456d	Change the default of AsmWriterClassName and isMCAsmWriter. llvm-svn: 196065	2013-12-02 04:55:42 +00:00
Tim Northover	45479dcf49	ARM: fix bug in -Oz stack adjustment folding Previously, we clobbered callee-saved registers when folding an "add sp, #N" into a "pop {rD, ...}" instruction. This change checks whether a register we're going to add to the "pop" could actually be live outside the function before doing so and should fix the issue. This should fix PR18081. llvm-svn: 196046	2013-12-01 14:16:24 +00:00
NAKAMURA Takumi	226e10edff	[CMake] Let add_public_tablegen_target() provide intrinsics_gen, too. I think, in principle, intrinsics_gen may be added explicitly. That said, it can be added incidentally, since each target already has dependencies on llvm-tblgen. Almost all source files depend on both CommonTableGen and intrinsics_gen. Explicit add_dependencies() calls have been pruned under lib/Target. llvm-svn: 195929	2013-11-28 17:04:31 +00:00
NAKAMURA Takumi	ce746c6c49	[CMake] Let add_public_tablegen_target responsible to provide dependency to CommonTableGen. add_public_tablegen_target adds *CommonTableGen to LLVM_COMMON_DEPENDS. LLVM_COMMON_DEPENDS affects add_llvm_library (and other add_target stuff) within its scope. llvm-svn: 195927	2013-11-28 17:04:04 +00:00
NAKAMURA Takumi	413518f1f8	[CMake] Prune include_directories() in llvm/lib/Target. add_llvm_target() sets them. llvm-svn: 195921	2013-11-28 14:53:30 +00:00
Tim Northover	fa36dfeeca	Darwin-ARM: use movw/movt for static relocations llvm-svn: 195759	2013-11-26 12:45:05 +00:00
Tim Northover	d34094e525	Fix indentation typo llvm-svn: 195660	2013-11-25 17:04:35 +00:00
Tim Northover	db962e2c45	ARM: remove special cases for Darwin dynamic-no-pic mode. These are handled almost identically to static mode (and ELF's global address materialisation), except that a symbol may have "$non_lazy_ptr" appended. This can be handled by passing appropriate flags along with the instruction instead of using entirely separate pseudo-instructions. llvm-svn: 195655	2013-11-25 16:24:52 +00:00
Tim Northover	dfe2156c91	ARM: remove unused patterns. There is no sane way for an LEApcrel (= single ADR) instruction to generate a global address on any ARM target I know of. Fortunately, no-one was trying to any more, but there were vestigial patterns. llvm-svn: 195644	2013-11-25 14:40:57 +00:00
Amara Emerson	34df448f7c	[ARM] Enable FeatureMP for Cortex-A5 by default. Patch by Oliver Stannard. llvm-svn: 195640	2013-11-25 13:17:15 +00:00
Richard Barton	c31078cded	Add support for Cortex-A12. Patch by Oliver Stannard! llvm-svn: 195448	2013-11-22 11:53:16 +00:00
Lang Hames	1ca1123598	Fix a typo where we were creating <def,kill> operands instead of <def,dead> ones. Add an assertion to make sure we catch this in the future. Fixes <rdar://problem/15464559>. llvm-svn: 195401	2013-11-22 00:46:32 +00:00
Artyom Skrobov	468ee230ea	[ARM] add basic Cortex-A7 support to LLVM backend llvm-svn: 195358	2013-11-21 14:03:21 +00:00
Juergen Ributzka	d12ccbd343	[weak vtables] Remove a bunch of weak vtables This patch removes most of the trivial cases of weak vtables by pinning them to a single object file. The memory leaks in this version have been fixed. Thanks Alexey for pointing them out. Differential Revision: http://llvm-reviews.chandlerc.com/D2068 Reviewed by Andy llvm-svn: 195064	2013-11-19 00:57:56 +00:00
Alexey Samsonov	49109a279c	Revert r194865 and r194874. This change is incorrect. If you delete the virtual destructor of both a base class and a subclass, then the following code: Base *foo = new Child(); delete foo; will not call the destructors for members of the Child class. As a result, I observe plenty of memory leaks. Notable examples I investigated are: ObjectBuffer and ObjectBufferStream, AttributeImpl and StringSAttributeImpl. llvm-svn: 194997	2013-11-18 09:31:53 +00:00
Juergen Ributzka	dbedae89b9	[weak vtables] Remove a bunch of weak vtables This patch removes most of the trivial cases of weak vtables by pinning them to a single object file. Differential Revision: http://llvm-reviews.chandlerc.com/D2068 Reviewed by Andy llvm-svn: 194865	2013-11-15 22:34:48 +00:00
Bob Wilson	9f3e6b25ee	Avoid illegal integer promotion in fastisel Stop folding constant adds into GEP when the type size doesn't match. Otherwise, the adds' operands are effectively being promoted, changing the conditions of an overflow. Results are different when: sext(a) + sext(b) != sext(a + b) Problem originally found on x86-64, but also fixed issues with ARM and PPC, which used similar code. <rdar://problem/15292280> Patch by Duncan Exon Smith! llvm-svn: 194840	2013-11-15 19:09:27 +00:00
Tim Northover	28adfbb0d1	ARM: produce friendly error for invalid inline asm We used to perform an invalid operation on an MVT and crash, which wasn't much fun. Patch by Oliver Stannard. llvm-svn: 194714	2013-11-14 17:15:39 +00:00
Weiming Zhao	0da5cc0765	Enable generating legacy IT block for AArch32 By default, the behavior of IT block generation will be determined dynamically based on the arch (armv8 vs armv7). This patch adds backend options: -arm-restrict-it and -arm-no-restrict-it. The former restricts the generation of IT blocks (the same behavior as thumbv8) for both arches. The latter allows the generation of legacy IT blocks (the same behavior as ARMv7 Thumb2) for both arches. Clang will support -mrestrict-it and -mno-restrict-it, which is compatible with GCC. llvm-svn: 194592	2013-11-13 18:29:49 +00:00
Tim Northover	8eaf1543e5	ARM: diagnose invalid system LDM/STM The system LDM and STM instructions can't usually writeback to the base register. The one exception is when an LDM is actually an exception-return (i.e. contains PC in the register list). (There's already a test that "ldm sp!, {r0-r3, pc}^" works, which is why there is no positive test). rdar://problem/15223374 llvm-svn: 194512	2013-11-12 21:32:41 +00:00
Bradley Smith	9aa8ac9f23	[ARM] Add support for FP_HP_extension build attribute llvm-svn: 194470	2013-11-12 10:38:05 +00:00
Artyom Skrobov	eff45103b3	[ARM] Add support for MVFR2 which is new in ARMv8 llvm-svn: 194416	2013-11-11 19:56:13 +00:00
Benjamin Kramer	3e9237a313	Remove some unnecessary temporary strings. llvm-svn: 194335	2013-11-09 22:48:13 +00:00
Logan Chien	a2630db16a	[arm] Refine ARMBuildAttrs.h. This commit cleans up some comments in ARMBuildAttrs.h. Besides, this commit fixes an error related to AllowWMMXv1 and AllowWMMXv2 (although they are not used currently.) llvm-svn: 194327	2013-11-09 14:16:52 +00:00
Tim Northover	93bcc66e73	ARM: fold prologue/epilogue sp updates into push/pop for code size ARM prologues usually look like: push {r7, lr} sub sp, sp, #4 If code size is extremely important, this can be optimised to the single instruction: push {r6, r7, lr} where we don't actually care about the contents of r6, but pushing it subtracts 4 from sp as a side effect. This should implement such a conversion, predicated on the "minsize" function attribute (-Oz) since I've yet to find any code it actually makes faster. llvm-svn: 194264	2013-11-08 17:18:07 +00:00
Artyom Skrobov	202ff08f97	[ARM] Handling for coprocessor instructions that are undefined starting from ARMv8 (Thumb encodings) llvm-svn: 194263	2013-11-08 16:25:50 +00:00
Artyom Skrobov	e686cec7d4	[ARM] Handling for coprocessor instructions that are undefined starting from ARMv8 (ARM encodings) llvm-svn: 194261	2013-11-08 16:16:30 +00:00
Artyom Skrobov	8653443902	[ARM] In ARMAsmParser, MatchCoprocessorOperandName() permitted p10 and p11 as operands for coprocessor instructions, resulting in encodings that clash with FP/NEON instruction encodings llvm-svn: 194253	2013-11-08 09:16:31 +00:00
Tim Northover	f02287db27	ARM: permit bare dmb/dsb/isb aliases on Cortex-M0 Cortex-M0 supports these 32-bit instructions despite being Thumb1 only (mostly). We knew about that but not that the aliases without the default "sy" operand were also permitted. llvm-svn: 194094	2013-11-05 21:36:02 +00:00
Tim Northover	c9432eb9e5	ARM: remove unnecessary state-tracking during frame lowering. ResolveFrameIndex had what appeared to be a very nasty hack for when the frame-index referred to a callee-saved register. In this case it "adjusted" the offset so that the address was correct if (and only if) the MachineInstr immediately followed the respective push. This "worked" for all forms of GPR & DPR but was only ever used to set the frame pointer itself, and once this was put in a more sensible location the entire state-tracking machinery it relied on became redundant. So I stripped it. The only wrinkle is that "add r7, sp, #0" might theoretically be slower (need an actual ALU slot) compared to "mov r7, sp" so I added a micro-optimisation that also makes emitARMRegUpdate and emitT2RegUpdate also work when NumBytes == 0. No test changes since there shouldn't be any functionality change. llvm-svn: 194025	2013-11-04 23:04:15 +00:00
Bob Wilson	e7dde0c061	Enable optimization of sin / cos pair into call to __sincos_stret for iOS7+. rdar://12856873 Patch by Evan Cheng, with a fix for rdar://13209539 by Tilmann Scheller llvm-svn: 193942	2013-11-03 06:14:38 +00:00
Bradley Smith	2521975a42	[ARM] Add Virtualization subtarget feature and more build attributes in this area Add a Virtualization ARM subtarget feature along with adding proper build attribute emission for Tag_Virtualization_use (encodes Virtualization and TrustZone) and Tag_MPextension_use. Also rework test/CodeGen/ARM/2010-10-19-mc-elf-objheader.ll testcase to something that is more maintainable. This changes the focus of this testcase away from testing CPU defaults (which is tested elsewhere), onto specifically testing that attributes are encoded correctly. llvm-svn: 193859	2013-11-01 13:27:35 +00:00
Bradley Smith	c848beba5e	[ARM] Fix Tag_ABI_HardFP_use build attribute Fix Tag_ABI_HardFP_use build attribute to handle single precision FP, replace deprecated Tag_ABI_HardFP_use value of 3 with 0 and also add some tests for Tag_ABI_VFP_args. llvm-svn: 193856	2013-11-01 11:21:16 +00:00
Jim Grosbach	7236678687	Legalize: Improve legalization of long vector extends. When an extend more than doubles the size of the elements (e.g., a zext from v16i8 to v16i32), the normal legalization method of splitting the vectors will run into problems as by the time the destination vector is legal, the source vector is illegal. The end result is the operation often becoming scalarized, with the typical horrible performance. For example, on x86_64, the simple input of: define void @bar(<16 x i8> %a, <16 x i32>* %p) nounwind { %tmp = zext <16 x i8> %a to <16 x i32> store <16 x i32> %tmp, <16 x i32>*%p ret void } Generates: .section __TEXT,__text,regular,pure_instructions .section __TEXT,__const .align 5 LCPI0_0: .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .section __TEXT,__text,regular,pure_instructions .globl _bar .align 4, 0x90 _bar: vpunpckhbw %xmm0, %xmm0, %xmm1 vpunpckhwd %xmm0, %xmm1, %xmm2 vpmovzxwd %xmm1, %xmm1 vinsertf128 $1, %xmm2, %ymm1, %ymm1 vmovaps LCPI0_0(%rip), %ymm2 vandps %ymm2, %ymm1, %ymm1 vpmovzxbw %xmm0, %xmm3 vpunpckhwd %xmm0, %xmm3, %xmm3 vpmovzxbd %xmm0, %xmm0 vinsertf128 $1, %xmm3, %ymm0, %ymm0 vandps %ymm2, %ymm0, %ymm0 vmovaps %ymm0, (%rdi) vmovaps %ymm1, 32(%rdi) vzeroupper ret So instead we can check if there are legal types that enable us to split more cleverly when the input vector is already legal such that we don't turn it into an illegal type. If the extend is such that it's more than doubling the size of the input we check if - the number of vector elements is even, - the source type is legal, - the type of a split source is illegal, - the type of an extended (by doubling element size) source is legal, and - the type of that extended source when split is legal. 
If the conditions are met, instead of just splitting both the destination and the source types, we create an extend that only goes up one "step" (doubling the element width), and then continue legalizing the rest of the operation normally. The result is that this operates as a new, more efficient, termination condition for the loop of "split the operation until the destination type is legal." With this change, the above example now compiles to: _bar: vpxor %xmm1, %xmm1, %xmm1 vpunpcklbw %xmm1, %xmm0, %xmm2 vpunpckhwd %xmm1, %xmm2, %xmm3 vpunpcklwd %xmm1, %xmm2, %xmm2 vinsertf128 $1, %xmm3, %ymm2, %ymm2 vpunpckhbw %xmm1, %xmm0, %xmm0 vpunpckhwd %xmm1, %xmm0, %xmm3 vpunpcklwd %xmm1, %xmm0, %xmm0 vinsertf128 $1, %xmm3, %ymm0, %ymm0 vmovaps %ymm0, 32(%rdi) vmovaps %ymm2, (%rdi) vzeroupper ret This generalizes a custom lowering that was added a while back to the ARM backend. That lowering is no longer necessary, and is removed. The testcases for it, however, provide excellent ARM tests for this change and so remain. rdar://14735100 llvm-svn: 193727	2013-10-31 00:20:48 +00:00
Artyom Skrobov	c1be9c16bc	[ARM] NEON instructions were erroneously decoded from certain invalid encodings llvm-svn: 193705	2013-10-30 18:10:09 +00:00
Manman Ren	b504f49448	Struct byval cleanup: add helper functions to reduce code duplication. Helper functions are added: emitPostLd: emit a post-increment load operation with given size. emitPostSt: emit a post-increment store operation with given size. No functionality change. llvm-svn: 193656	2013-10-29 22:27:32 +00:00
Rafael Espindola	e133ed88b5	Move getSymbol to TargetLoweringObjectFile. This allows constructing a Mangler with just a TargetMachine. llvm-svn: 193630	2013-10-29 17:28:26 +00:00
Rafael Espindola	79858aa3df	Add a helper getSymbol to AsmPrinter. llvm-svn: 193627	2013-10-29 17:07:16 +00:00
Amara Emerson	f9a67fce26	[ARM] Make sure HasCRC is initialized to false in Subtarget. llvm-svn: 193624	2013-10-29 16:54:52 +00:00
Bernard Ogden	ee87e85505	ARM: Add subtarget feature for CRC Adds a subtarget feature for the CRC instructions (optional in v8-A) to the ARM (32-bit) backend. Differential Revision: http://llvm-reviews.chandlerc.com/D2036 llvm-svn: 193599	2013-10-29 09:47:35 +00:00
Arnold Schwaighofer	89ae217422	ARM cost model: Unaligned vectorized double stores are expensive Updated a test case that assumed that <2 x double> would vectorize to use <4 x float>. radar://15338229 llvm-svn: 193574	2013-10-29 01:33:57 +00:00
Arnold Schwaighofer	77af0f6e82	ARM cost model: Account for zero cost scalar SROA instructions By vectorizing a series of srl, or, ... instructions we have obfuscated the intention so much that the backend does not know how to fold this code away. radar://15336950 llvm-svn: 193573	2013-10-29 01:33:53 +00:00
Lang Hames	b52816615b	Return early from getUnconditionalBranchTargetOpValue if the branch target is an MCExpr, in order to avoid writing an encoded zero value in the immediate field. When getUnconditionalBranchTargetOpValue is called with an MCExpr target, we don't know what the final immediate field value should be. We shouldn't explicitly set the immediate field to an encoded zero value as zero is encoded with a non-zero bit pattern. This leads to bits being set that pollute the final immediate value. The nature of the encoding is such that the polluted bits only affect very large immediate values, explaining why this hasn't caused problems earlier. Fixes <rdar://problem/15155975>. llvm-svn: 193535	2013-10-28 20:51:11 +00:00
Logan Chien	8cbb80d159	[arm] Implement eabi_attribute, cpu, and fpu directives. This commit allows the ARM integrated assembler to parse and assemble the code with .eabi_attribute, .cpu, and .fpu directives. To implement the feature, this commit moves the code from AttrEmitter to ARMTargetStreamers, and several new test cases related to cortex-m4, cortex-r5, and cortex-a15 are added. Besides, this commit also changes the Subtarget->isFPOnlySP() to Subtarget->hasD16() to match the usage of the .fpu directive. This commit changes the test cases: * Several .eabi_attribute directives in 2010-09-29-mc-asm-header-test.ll are removed because the .fpu directive already covers the functionality. * In the Cortex-A15 test case, the value for Tag_Advanced_SIMD_arch has been changed from 1 to 2, which is more precise. llvm-svn: 193524	2013-10-28 17:51:12 +00:00
Tim Northover	1744d0ad83	ARM: allow .thumb_func to be separated from symbol definition When assembling, a .thumb_func directive is supposed to be applicable to the next symbol definition, even if there are intervening directives. We were racing ahead to try and find it, and this commit should fix the issue. Patch by Gabor Ballabas llvm-svn: 193403	2013-10-25 12:49:50 +00:00
Tim Northover	c7ea8048e7	ARM: don't expand atomicrmw inline on Cortex-M0 There's a barrier instruction so that should still be used, but most actual atomic operations are going to need a platform decision on the correct behaviour (either nop if single-threaded or OS-support otherwise). rdar://problem/15287210 llvm-svn: 193399	2013-10-25 09:30:24 +00:00
Jim Grosbach	1d1d6d4675	ARM: Tweak usage of '*vfp' compiler_rt functions. Only use them if the subtarget has ARM mode, as these routines are implemented as ARM code. rdar://15302004 llvm-svn: 193381	2013-10-24 23:07:11 +00:00
David Peixotto	b0653e539b	Remove class abstraction from ARM struct byval lowering This commit changes the struct byval lowering for arm to use inline checks for the subtarget instead of a class abstraction to represent the differences. The class abstraction was judged to be too much code for this task. No intended functionality change. llvm-svn: 193357	2013-10-24 16:39:36 +00:00
Tim Northover	5620faf771	ARM: Mark double-precision instructions as such This prevents us from silently accepting invalid instructions on (for example) Cortex-M4 with just single-precision VFP support. No tests for the extra Pat Requires because they're essentially assertions: the affected code should have been lowered to libcalls before ISel. rdar://problem/15302004 llvm-svn: 193354	2013-10-24 15:49:39 +00:00
Tim Northover	225bcbbe71	ARM: add a couple more NEON predicates. The fused multiply instructions were added in VFPv4 but are still NEON instructions, in particular they shouldn't be available on a Cortex-M4 no matter how floaty it is. llvm-svn: 193342	2013-10-24 12:48:05 +00:00
Tim Northover	64dacb2b8a	ARM: mark various aliases with their architecture requirements. If an alias inherits directly from InstAlias then it doesn't get any default "Requires" values, so llvm-mc will allow it even on architectures that don't support the underlying instruction. This tidies up the obvious VFP and NEON cases I found. llvm-svn: 193340	2013-10-24 12:22:58 +00:00
Tim Northover	94ecbd2e6c	ARM: Use non-VFP softcalls on embedded Darwinish targets The compiler-rt functions __adddf3vfp and so on exist purely to allow Thumb1 code to make use of VFP instructions by switching back to ARM mode, they make no sense for M-class processors which don't even have an ARM mode. Given that justification, in practice this is a platform ABI decision so the actual check is based on that rather than CPU features. rdar://problem/15302004 llvm-svn: 193327	2013-10-24 10:37:09 +00:00
Tim Northover	741e6ef4d4	ARM: fix assert on unpredictable POP instruction. POP instructions are aliased to the ARM LDM variants but have different syntax. This caused two problems: we tried to access a non-existent operand to annotate the '!', and the error message didn't make much sense. With some vigorous hand-waving in the error message both problems can be fixed. llvm-svn: 193322	2013-10-24 09:37:18 +00:00
Artyom Skrobov	fc12e7016c	Make ARM hint ranges consistent, and add tests for these ranges llvm-svn: 193238	2013-10-23 10:14:40 +00:00
Tim Northover	08a8660260	ARM: provide diagnostics on more writeback LDM/STM instructions The set of circumstances where the writeback register is allowed to be in the list of registers is rather baroque, but I think this implements them all on the assembly parsing side. For disassembly, we still warn about an ARM-mode LDM even if the architecture revision is < v7 (the required architecture information isn't available). It's a silly instruction anyway, so hopefully no-one will mind. rdar://problem/15223374 llvm-svn: 193185	2013-10-22 19:00:39 +00:00
Jim Grosbach	dba14ddd4f	ARM: Thumb2 copy for GPRPair needs to use thumb instructions. Use tMOVr instead of plain MOVr. rdar://15193017 llvm-svn: 193139	2013-10-22 02:29:37 +00:00
Jim Grosbach	8815bef000	ARM: Clean up copyPhysReg() a bit. No functional change, just cleaning things up for readability. llvm-svn: 193138	2013-10-22 02:29:35 +00:00
Richard Barton	a661b44a5d	Pure refactoring change. Patch by Artyom Skrobov. llvm-svn: 192977	2013-10-18 14:41:50 +00:00
Richard Barton	87dacc38b8	Add hint disassembly syntax for 16-bit Thumb hint instructions. Patch by Artyom Skrobov llvm-svn: 192972	2013-10-18 14:09:49 +00:00
Silviu Baranga	314e58fdcc	Add hardware division as a default feature on Cortex-A15. Also add test cases to check this, and change diagnostics for the hwdiv-arm feature to something useful. llvm-svn: 192963	2013-10-18 10:18:40 +00:00
David Peixotto	8e5abc52cb	17309 ARM backend incorrectly lowers COPY_STRUCT_BYVAL_I32 for thumb1 targets This commit implements the correct lowering of the COPY_STRUCT_BYVAL_I32 pseudo-instruction for thumb1 targets. Previously, the lowering of COPY_STRUCT_BYVAL_I32 generated the post-increment forms of ldr/ldrh/ldrb instructions. Thumb1 does not have the post-increment form of these instructions so the generated assembly contained invalid instructions. Passing the generated assembly to gcc caused it to complain with an error like this: Error: cannot honor width suffix -- `ldrb r3,[r0],#1' and the integrated assembler would generate an object file with an invalid instruction encoding. This commit contains a small test case that demonstrates the problem with thumb1 targets as well as an expanded test case that more thoroughly tests the lowering of byval struct passing for arm, thumb1, and thumb2 targets. llvm-svn: 192916	2013-10-17 19:52:05 +00:00
David Peixotto	c32e24a1b7	Refactor lowering for COPY_STRUCT_BYVAL_I32 This commit refactors the lowering of the COPY_STRUCT_BYVAL_I32 pseudo-instruction in the ARM backend. We introduce a new helper class that encapsulates all of the operations needed during the lowering. The operations are implemented for each subtarget in different subclasses. Currently only arm and thumb2 subtargets are supported. This refactoring was done to easily implement support for thumb1 subtargets. This initial patch does not add support for thumb1, but is only a refactoring. A follow on patch will implement the support for thumb1 subtargets. No intended functionality change. llvm-svn: 192915	2013-10-17 19:49:22 +00:00
Rafael Espindola	43c4e24fad	Add a MCAsmInfoELF class and factor some code into it. We had a MCAsmInfoCOFF, but no common class for all the ELF MCAsmInfos before. llvm-svn: 192760	2013-10-16 01:34:32 +00:00
Manman Ren	fd956dbae0	Struct byval: fix a copy-paste error for thumb2. PR17309 llvm-svn: 192730	2013-10-15 19:42:32 +00:00
Bernard Ogden	53169762d0	Add Cortex-A57 support llvm-svn: 192591	2013-10-14 13:17:07 +00:00
Bernard Ogden	4400cde89a	Add subtarget feature support for Cortex-A53 Some previous implicit defaults have changed, for example FP and NEON are now on by default. llvm-svn: 192590	2013-10-14 13:16:57 +00:00
Amara Emerson	ac6950863f	[ARM] Fix FP ABI attributes with no VFP enabled. llvm-svn: 192458	2013-10-11 16:03:43 +00:00
Benjamin Kramer	b3b79a4345	ARM: Put isV8EligibleForIT into the llvm namespace. While there make it inline. llvm-svn: 192350	2013-10-10 14:35:45 +00:00
Tim Northover	569f69dace	ARM: correct liveness flags during ARMLoadStoreOpt When we had a sequence like: s1 = VLDRS [r0, 1], Q0<imp-def> s3 = VLDRS [r0, 2], Q0<imp-use,kill>, Q0<imp-def> s0 = VLDRS [r0, 0], Q0<imp-use,kill>, Q0<imp-def> s2 = VLDRS [r0, 4], Q0<imp-use,kill>, Q0<imp-def> we were gathering the {s0, s1} loads below the s3 load. This is fine, but confused the verifier since now the s3 load had Q0<imp-use> with no definition above it. This should mark such uses <undef> as well. The liveness structure at the beginning and end of the block is unaffected, and the true sN definitions should prevent any dodgy reorderings being introduced elsewhere. rdar://problem/15124449 llvm-svn: 192344	2013-10-10 09:28:20 +00:00
Benjamin Kramer	4188293c72	Flip the ownership of MCStreamer and MCTargetStreamer. MCStreamer now owns the target streamer. This prevents leaking the target streamer. llvm-svn: 192303	2013-10-09 17:23:41 +00:00
Rafael Espindola	a17151ad5a	Add a MCTargetStreamer interface. This patch fixes an old FIXME by creating a MCTargetStreamer interface and moving the target specific functions for ARM, Mips and PPC to it. The ARM streamer is still declared in a common place because it is used from lib/CodeGen/ARMException.cpp, but the Mips and PPC are completely hidden in the corresponding Target directories. I will send an email to llvmdev with instructions on how to use this. llvm-svn: 192181	2013-10-08 13:08:17 +00:00
Manman Ren	5a78755336	Struct byval: use the correct alignment for loads generated to load from struct byval to registers. We used to pass 0 which means the alignment of PtrVT. Even when the alignment of the struct is smaller than 4, the LOADs would have alignment of 4, and further optimizations could combine the LOADs into a ldm, which would cause a crash. The fix is to pass the alignment of the struct byval. rdar://problem/15144402 llvm-svn: 192126	2013-10-07 19:47:53 +00:00
Amara Emerson	5035ee0212	[ARM] Improve build attributes emission. llvm-svn: 192111	2013-10-07 16:55:23 +00:00
Rafael Espindola	e90fd9c5e0	Remove getEHExceptionRegister and getEHHandlerRegister. They haven't been used for a long time. Patch by MathOnNapkins. llvm-svn: 192099	2013-10-07 13:39:22 +00:00
Tim Northover	f86d1f0b77	ARM: allow cortex-m0 to use hint instructions The hint instructions ("nop", "yield", etc) are mostly Thumb2-only, but have been ported across to the v6M architecture. Fortunately, v6M seems to sit nicely between v6 (thumb-1 only) and v6T2, so we can add a feature for it fairly easily. rdar://problem/15144406 llvm-svn: 192097	2013-10-07 11:10:47 +00:00
Rafael Espindola	ac4ad25a00	Remove some really nasty uses of hasRawTextSupport. When MC was first added, targets could use hasRawTextSupport to keep features working before they were added to the MC interface. The design goal of MC is to provide a uniform API for printing assembly and object files. Short of relaxations and other corner cases, an object file is just another representation of the assembly. It was never the intention that targets would keep doing things like if (hasRawTextSupport()) Set flags in one way. else Set flags in another way. When they do that they create two code paths and the object file is no longer just another representation of the assembly. This also then requires testing with llc -filetype=obj, which is extremely brittle. This patch removes some of these hacks by replacing them with smaller ones. The ARM flag setting is trivial, so I just moved it to the constructor. For Mips, the patch adds two temporary hack directives that allow the assembly to represent the same things as the object file was already able to. The hope is that the mips developers will replace the hack directives with the same ones that gas uses and drop the -print-hack-directives flag. I will also try to implement a target streamer interface, so that we can move this out of the common code. In summary, for any new work, two rules of thumb are * Don't use "llc -filetype=obj" in tests. * Don't add calls to hasRawTextSupport. llvm-svn: 192035	2013-10-05 16:42:21 +00:00
Matthias Braun	2f169f900b	ARM: optimizeSelect has to consider the previous register class optimizeSelect folds (predicated) copy instructions; it must not ignore the original register class of the operand when replacing the register with the copy's dest register. llvm-svn: 191963	2013-10-04 16:52:56 +00:00
Matthias Braun	c22630e164	ARM: do not add a regmask for TAILJUMPs The jump doesn't really kill the registers, the following call does but we never get back anyway. This avoids some verify-machineinstrs problems when TAILJUMPs are if-converted. llvm-svn: 191962	2013-10-04 16:52:54 +00:00
Matthias Braun	da621165ca	ARM: preserve undef flag in pseudo instruction expanders Copy over the whole register machine operand instead of creating a new one with an incomplete set of flags. llvm-svn: 191961	2013-10-04 16:52:51 +00:00
Amara Emerson	52cfb6a99a	[ARM] Warn on deprecated IT blocks in v8 AArch32 assembly. Patch by Artyom Skrobov. llvm-svn: 191885	2013-10-03 09:31:51 +00:00
Tim Northover	d840745829	ARM: support interrupt attribute This function-attribute modifies the callee-saved register list and function epilogue (specifically the return instruction) so that a routine is suitable for use as an interrupt-handler of the specified type without disrupting user-mode applications. rdar://problem/14207019 llvm-svn: 191766	2013-10-01 14:33:28 +00:00
Joey Gouly	510de640c3	[ARM] Remove an unused function from the disassembler. Pointed out by Joerg. llvm-svn: 191749	2013-10-01 13:01:10 +00:00
Joey Gouly	ad98f1671d	[ARM] Introduce the 'sevl' instruction in ARMv8. This also removes the restriction on the immediate field of the 'hint' instruction. llvm-svn: 191744	2013-10-01 12:39:11 +00:00
Tilmann Scheller	be904775d2	[ARM] Clean up ARMAsmParser::validateInstruction(). Fix some LLVM Coding Standards violations. No changes in functionality. llvm-svn: 191686	2013-09-30 17:57:30 +00:00
Tilmann Scheller	255722beb8	[ARM] Assembler: ARM LDRD with writeback requires the base register to be different from the destination registers. See ARM ARM A8.8.72. Violating this constraint results in unpredictable behavior. llvm-svn: 191678	2013-09-30 16:11:48 +00:00
Arnold Schwaighofer	66eb921a82	Swift model: Fix uop description on some writes Those writes really need two/three uops. llvm-svn: 191677	2013-09-30 15:56:34 +00:00

7122 Commits