llvm-project

Commit Graph

Author	SHA1	Message	Date
Sebastian Pop	957a6583f1	updated patch for the ARM fused multiply add/sub In this update: - I assumed neon2 does not imply vfpv4, but neon and vfpv4 imply neon2. - I kept setting .fpu=neon-vfpv4 code attribute because that is what the assembler understands. Patch by Ana Pazos <apazos@codeaurora.org> llvm-svn: 152036	2012-03-05 17:39:52 +00:00
Evan Cheng	65f9d19c4f	Re-commit r151623 with fix. Only issue special no-return calls if it's a direct call. llvm-svn: 151645	2012-02-28 18:51:51 +00:00
Daniel Dunbar	ee7b899343	Revert r151623 "Some ARM implementations, e.g. A-series, do return stack prediction. ...", it is breaking the Clang build during the Compiler-RT part. llvm-svn: 151630	2012-02-28 15:36:07 +00:00
Evan Cheng	87c7b09d8d	Some ARM implementations, e.g. A-series, do return stack prediction. That is, the processor keeps a return address stack (RAS) which stores the address and the instruction execution state of the instruction after a function-call type branch instruction. Calling a "noreturn" function with normal call instructions (e.g. bl) can corrupt the RAS and cause 100% return misprediction, so LLVM should use an unconditional branch instead. i.e. mov lr, pc b _foo The "mov lr, pc" is issued in order to get a proper backtrace. rdar://8979299 llvm-svn: 151623	2012-02-28 06:42:03 +00:00
Jia Liu	b22310fda6	Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, MSP430, PPC, PTX, Sparc, X86, XCore. llvm-svn: 150878	2012-02-18 12:03:15 +00:00
Anton Korobeynikov	5482b9f535	Add fused multiply+add instructions from VFPv4. Patch by Ana Pazos! llvm-svn: 148658	2012-01-22 12:07:33 +00:00
Evan Cheng	68132d8093	ARM target code clean up. Check for iOS, not Darwin where it makes sense. llvm-svn: 146981	2011-12-20 18:26:50 +00:00
Evan Cheng	94307f6ba6	Hide cpu name checking in ARMSubtarget. llvm-svn: 144154	2011-11-09 01:57:03 +00:00
David Meyer	49045ddb4c	Remove NaClMode llvm-svn: 142338	2011-10-18 05:29:23 +00:00
Bob Wilson	8decdc472f	Reenable tail calls for iOS 5.0 and later. llvm-svn: 141370	2011-10-07 17:17:49 +00:00
James Molloy	21efa7d6e1	Check in a patch that has already been code reviewed by Owen that I'd forgotten to commit. Build on previous patches to successfully distinguish between an M-series and A/R-series MSR and MRS instruction. These take different mask names and have a slightly different opcode format. Add decoder and disassembler tests. Improvement on the previous patch - successfully distinguish between valid v6m and v7m masks (one is a subset of the other). The patch had to be edited slightly to apply to ToT. llvm-svn: 140696	2011-09-28 14:21:38 +00:00
Nick Lewycky	73df7e3830	Add a new MC bit for NaCl (Native Client) mode. NaCl requires that certain instructions are more aligned than the CPU requires, and adds some additional directives, to follow in future patches. Patch by David Meyer! llvm-svn: 139125	2011-09-05 21:51:43 +00:00
Evan Cheng	6dbe713a49	Rewrite comment in English. llvm-svn: 134627	2011-07-07 19:09:06 +00:00
Evan Cheng	1834f5dcb6	Rename attribute 'thumb' to a more descriptive 'thumb-mode'. llvm-svn: 134626	2011-07-07 19:05:12 +00:00
Evan Cheng	1a72add615	Compute feature bits at time of MCSubtargetInfo initialization. llvm-svn: 134606	2011-07-07 07:07:08 +00:00
Evan Cheng	8b2bda09a5	Change some ARM subtarget features to be single bit yes/no in order to sink them down to MC layer. Also fix tests. llvm-svn: 134590	2011-07-07 03:55:05 +00:00
Evan Cheng	2bd65363a8	Factor ARM triple parsing out of ARMSubtarget. Another step towards making ARM subtarget info available to MC. llvm-svn: 134569	2011-07-07 00:08:19 +00:00
Evan Cheng	c9c090d7a5	Rename XXXGenSubtarget.inc to XXXGenSubtargetInfo.inc for consistency. llvm-svn: 134281	2011-07-01 22:36:09 +00:00
Jim Grosbach	cf1464d943	ARMv7M vs. ARMv7E-M support. The DSP instructions in the Thumb2 instruction set are an optional extension in the Cortex-M* architecture. When present, the implementation is considered an "ARMv7E-M implementation," and when not, an "ARMv7-M implementation." Add a subtarget feature hook for the v7e-m instructions and hook it up. The cortex-m3 cpu is an example of a v7m implementation, while the cortex-m4 is a v7e-m implementation. rdar://9572992 llvm-svn: 134261	2011-07-01 21:12:19 +00:00
Evan Cheng	0d639a28aa	Rename TargetSubtarget to TargetSubtargetInfo for consistency. llvm-svn: 134259	2011-07-01 21:01:15 +00:00
Evan Cheng	54b68e3432	- Added MCSubtargetInfo to capture subtarget features and scheduling itineraries. - Refactor TargetSubtarget to be based on MCSubtargetInfo. - Change tablegen generated subtarget info to initialize MCSubtargetInfo and hide more details from targets. llvm-svn: 134257	2011-07-01 20:45:01 +00:00
Evan Cheng	fe6e405e8c	Fix the ridiculous SubtargetFeatures API where it implicitly expects the CPU name to be encoded as the first feature. It then uses the CPU name to look up features / scheduling itinerary even though clients know full well the CPU name being used to query these properties. The fix is to just have the clients explicitly pass the CPU name! llvm-svn: 134127	2011-06-30 01:53:36 +00:00
Evan Cheng	8264e272a9	Sink SubtargetFeature and TargetInstrItineraries (renamed MCInstrItineraries) into MC. llvm-svn: 134049	2011-06-29 01:14:12 +00:00
Evan Cheng	4fcd8250ae	Revert accidental commit. llvm-svn: 131739	2011-05-20 17:38:48 +00:00
Evan Cheng	e8d2e9eb35	Revert r131664 and fix it in instcombine instead. rdar://9467055 llvm-svn: 131708	2011-05-20 00:54:37 +00:00
Evan Cheng	5f1ba4cd2d	Remove -use-divmod-libcall. Let targets opt in when they are available. llvm-svn: 129884	2011-04-20 22:20:12 +00:00
Daniel Dunbar	2b9b0e3748	ADT/Triple: Move a variety of clients to using isOSDarwin() and isOSWindows() predicates. llvm-svn: 129816	2011-04-19 21:14:45 +00:00
Bob Wilson	a2881ee8a4	Avoid some 's' 16-bit instruction which partially update CPSR (and add false dependency) when it isn't dependent on last CPSR defining instruction. rdar://8928208 llvm-svn: 129773	2011-04-19 18:11:49 +00:00
Evan Cheng	38bf5adcea	Distribute (A + B) * C to (A * C) + (B * C) to make use of NEON multiplier accumulator forwarding: vadd d3, d0, d1 vmul d3, d3, d2 => vmul d3, d0, d2 vmla d3, d1, d2 llvm-svn: 128665	2011-03-31 19:38:48 +00:00
Evan Cheng	e45d685895	Clean up ARM subtarget code by using Triple ADT. llvm-svn: 123276	2011-01-11 21:46:47 +00:00
Andrew Trick	10ffc2b6c2	Various bits of framework needed for precise machine-level selection DAG scheduling during isel. Most new functionality is currently guarded by -enable-sched-cycles and -enable-sched-hazard. Added InstrItineraryData::IssueWidth field, currently derived from ARM itineraries, but could be initialized differently on other targets. Added ScheduleHazardRecognizer::MaxLookAhead to indicate whether it is active, and if so how many cycles of state it holds. Added SchedulingPriorityQueue::HasReadyFilter to allow gating entry into the scheduler's available queue. ScoreboardHazardRecognizer now accesses the ScheduleDAG in order to get information about its SUnits, provides RecedeCycle for bottom-up scheduling, correctly computes scoreboard depth, tracks IssueCount, and considers potential stall cycles when checking for hazards. ScheduleDAGRRList now models machine cycles and hazards (under flags). It tracks MinAvailableCycle, drives the hazard recognizer and priority queue's ready filter, manages a new PendingQueue, properly accounts for stall cycles, etc. llvm-svn: 122541	2010-12-24 05:03:26 +00:00
Andrew Trick	c416ba612b	whitespace llvm-svn: 122539	2010-12-24 04:28:06 +00:00
Evan Cheng	62c7b5bf76	Making use of VFP / NEON floating point multiply-accumulate / subtraction is difficult on current ARM implementations for a few reasons. 1. Even though a single vmla has latency that is one cycle shorter than a pair of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause additional pipeline stall. So it's frequently better to single codegen vmul + vadd. 2. A vmla followed by a vmul, vmadd, or vsub causes the second fp instruction to stall for 4 cycles. We need to schedule them apart. 3. A vmla followed by a vmla is a special case. Obviously, issuing back to back RAW vmla + vmla is very bad. But this isn't ideal either: vmul vadd vmla Instead, we want to expand the second vmla: vmla vmul vadd Even with the 4 cycle vmul stall, the second sequence is still 2 cycles faster. Up to now, isel simply avoids codegen'ing fp vmla / vmls. This works well enough but it isn't the optimal solution. This patch attempts to make it possible to use vmla / vmls in cases where it is profitable. A. Add missing isel predicates which cause vmla to be codegen'ed. B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to compute a fmul and a fmla. C. Add additional isel checks for vmla, avoid cases where vmla is feeding into fp instructions (except for the #3 exceptional case). D. Add ARM hazard recognizer to model the vmla / vmls hazards. E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the vmla / vmls will trigger one of the special hazards. Work in progress, only A+B are enabled. llvm-svn: 120960	2010-12-05 22:04:16 +00:00
Evan Cheng	8740ee3637	Fix preload instruction isel. Only v7 supports pli, and only v7 with mp extension supports pldw. Add subtarget attribute to denote mp extension support and legalize illegal ones to nothing. llvm-svn: 118160	2010-11-03 06:34:55 +00:00
Bob Wilson	dd6eb5b5a1	PR8359: The ARM backend may end up allocating registers D16 to D31 when "-mattr=+vfp3" is specified. However, this will not work for hardware that only supports 16 registers. Add a new flag to support "-mattr=+vfp3,+d16". Patch by Jan Voung! llvm-svn: 116310	2010-10-12 16:22:47 +00:00
Rafael Espindola	66e08d43d2	Jim Asked us to move DataLayout on ARM back to the most specialized classes. Do so and also change X86 for consistency. Investigating if this can be improved a bit. llvm-svn: 115469	2010-10-03 18:59:45 +00:00
Bob Wilson	97bf273870	Increase ARM APCS preferred alignment for i64 and f64 from 32 bits to 64 bits. LDM/STM instructions can run one cycle faster on some ARM processors if the memory address is 64-bit aligned. Radar 8489376. llvm-svn: 115047	2010-09-29 17:54:10 +00:00
Owen Anderson	a3181e2d79	Add a subtarget hook for reporting the misprediction penalty. Use this to provide more precise cost modeling for if-conversion. Now if only we had a way to estimate the misprediction probability. Adjust CodeGen/ARM/ifcvt10.ll. The pipeline on Cortex-A8 is long enough that it is still profitable to predicate an ldm, but the shorter pipeline on Cortex-A9 makes it unprofitable. llvm-svn: 114995	2010-09-28 21:57:50 +00:00
Bob Wilson	3dc97324c1	Add a command line option "-arm-strict-align" to disallow unaligned memory accesses for ARM targets that would otherwise allow it. Radar 8465431. llvm-svn: 114941	2010-09-28 04:09:35 +00:00
Daniel Dunbar	6b2aaf1a36	Hard to imagine there are still people using inferior compilers. llvm-svn: 114862	2010-09-27 20:12:58 +00:00
Rafael Espindola	69aa15155f	Odd additional stub framework for the ARM MC ELF emission. llc now recognizes the "intent" to support MC/obj emission for ARM, but given that they are all stubs, it asserts on --filetype=obj --march=arm Patch by Jason Kim. llvm-svn: 114856	2010-09-27 18:31:37 +00:00
Evan Cheng	bf4070756f	Teach if-converter to be more careful with predicating instructions that would take multiple cycles to decode. For the current if-converter clients (actually only ARM), the instructions that are predicated on false are not nops. They would still take machine cycles to decode. Micro-coded instructions such as LDM / STM can potentially take multiple cycles to decode. If-converter should take treat them as non-micro-coded simple instructions. llvm-svn: 113570	2010-09-10 01:29:16 +00:00
Jim Grosbach	4d5dc3e7e5	cortex m4 has floating point support, but only single precision. llvm-svn: 110810	2010-08-11 15:44:15 +00:00
Evan Cheng	5190f09291	Report error if codegen tries to instantiate an ARM target when the cpu does not support it, e.g. cortex-m* processors. llvm-svn: 110798	2010-08-11 07:17:46 +00:00
Evan Cheng	40921a4e62	Add ARM Archv6M and let it imply FeatureDB (having dmb, etc.) llvm-svn: 110795	2010-08-11 06:51:54 +00:00
Evan Cheng	6e809de90c	- Add subtarget feature -mattr=+db which determine whether an ARM cpu has the memory and synchronization barrier dmb and dsb instructions. - Change instruction names to something more sensible (matching name of actual instructions). - Added tests for memory barrier codegen. llvm-svn: 110785	2010-08-11 06:22:01 +00:00
Evan Cheng	ce8fb68078	Change -prefer-32bit-thumb to attribute -mattr=+32bit instead to disable more 32-bit to 16-bit optimizations. llvm-svn: 110584	2010-08-09 18:35:19 +00:00
Evan Cheng	58066e337d	Add an ARM "feature". Cortex-a8 fp comparison is very slow (> 20 cycles). llvm-svn: 108256	2010-07-13 19:21:50 +00:00
Shantonu Sen	94231eec1f	Fix "warning: extra ';' inside a struct or union" when building llvm with clang llvm-svn: 103179	2010-05-06 14:57:47 +00:00
Jim Grosbach	151cd8f159	Cleanup of ARMv7M support. Move hardware divide and Thumb2 extract/pack instructions to subtarget features and update tests to reflect. PR5717. llvm-svn: 103136	2010-05-05 23:44:43 +00:00
Jim Grosbach	92d999001c	Add initial support for ARMv7M subtarget and cortex-m3 cpu. Patch by Jordy <snhjordy@gmail.com>. Followup patches will add some tests and adjust to use Subtarget features for the instructions. llvm-svn: 103119	2010-05-05 20:44:35 +00:00
Dan Gohman	bcaf681cde	Add const qualifiers to CodeGen's use of LLVM IR constructs. llvm-svn: 101334	2010-04-15 01:51:59 +00:00
Jim Grosbach	a43386ba8f	switch the use-vml[as] instructions flag to a subtarget 'feature' llvm-svn: 99565	2010-03-25 23:11:16 +00:00
Jim Grosbach	34de7768bf	Make the use of the vmla and vmls VFP instructions controllable via cmd line. Preliminary testing shows significant performance wins by not using these instructions. llvm-svn: 99436	2010-03-24 22:31:46 +00:00
Anton Korobeynikov	0a65a37344	Add subtarget feature for FP16 llvm-svn: 98503	2010-03-14 18:42:38 +00:00
Bob Wilson	c499fae068	Lower small memcpys to load/stores on Thumb2. Radar 7686922. llvm-svn: 98210	2010-03-11 00:20:49 +00:00
Anton Korobeynikov	bf16a17fc1	Initial bits of ARMv4-only support. Patch by John Tytgat! llvm-svn: 97886	2010-03-06 19:39:36 +00:00
Bob Wilson	505ddaa4dc	Remove isProfitableToDuplicateIndirectBranch target hook. It is profitable for all the processors where I have tried it, and even when it might not help performance, the cost is quite low. The opportunities for duplicating indirect branches are limited by other factors so code size does not change much due to tail duplicating indirect branches aggressively. llvm-svn: 90144	2009-11-30 18:35:03 +00:00
Anton Korobeynikov	2522908653	Materialize global addresses via movt/movw pair, this is always better than doing the same via constpool: 1. Load from constpool costs 3 cycles on A9, movt/movw pair - just 2. 2. Load from constpool might stall up to 300 cycles due to cache miss. 3. Movt/movw does not use load/store unit. 4. Less constpool entries => better compiler performance. This is only enabled on ELF systems, since darwin does not have needed relocations (yet). llvm-svn: 89720	2009-11-24 00:44:37 +00:00
Bob Wilson	290e9a47a9	Add a target hook to allow changing the tail duplication limit based on the contents of the block to be duplicated. Use this for ARM Cortex A8/9 to be more aggressive tail duplicating indirect branches, since it makes it much more likely that they will be predicted in the branch target buffer. Testcase coming soon. llvm-svn: 89187	2009-11-18 03:34:27 +00:00
David Goodwin	b9fe5d5d02	Allow target to specify regclass for which antideps will only be broken along the critical path. llvm-svn: 88682	2009-11-13 19:52:48 +00:00
David Goodwin	0d412c2528	Fixed to address code review. No functional changes. llvm-svn: 86634	2009-11-10 00:48:55 +00:00
David Goodwin	cf89db135e	Allow targets to specify register classes whose member registers should not be renamed to break anti-dependencies. llvm-svn: 86628	2009-11-10 00:15:47 +00:00
David Goodwin	8370485db9	Break anti-dependence breaking out into its own class. llvm-svn: 85127	2009-10-26 16:59:04 +00:00
David Goodwin	02ad4cb32e	Allow the target to select the level of anti-dependence breaking that should be performed by the post-RA scheduler. The default is none. llvm-svn: 84911	2009-10-22 23:19:17 +00:00
Evan Cheng	007ceb4603	Change createPostRAScheduler so it can be turned off at llc -O1. llvm-svn: 84273	2009-10-16 21:06:15 +00:00
David Goodwin	17199b56b0	Remove -post-RA-schedule flag and add a TargetSubtarget method to enable post-register-allocation scheduling. By default it is off. For ARM, enable/disable with -mattr=+/-postrasched. Enable by default for cortex-a8. llvm-svn: 83122	2009-09-30 00:10:16 +00:00
Evan Cheng	1b38952c99	References to hidden symbols do not have to go through non-lazy pointer in non-pic mode. rdar://7187172. llvm-svn: 80904	2009-09-03 07:04:02 +00:00
Evan Cheng	43b9ca6f42	Let Darwin linker auto-synthesize stubs and lazy-pointers. This deletes a bunch of nasty code in ARM asm printer. llvm-svn: 80404	2009-08-28 23:18:09 +00:00
Jim Grosbach	f24f9d9cb6	Whitespace cleanup. Remove trailing whitespace. llvm-svn: 78666	2009-08-11 15:33:49 +00:00
David Goodwin	a307edbdd5	By default, for cortex-a8 use NEON for single-precision FP. llvm-svn: 78200	2009-08-05 16:01:19 +00:00
David Goodwin	3b9c52c5c1	Initial support for single-precision FP using NEON. Added "neonfp" attribute to enable. Added patterns for some binary FP operations. llvm-svn: 78081	2009-08-04 17:53:06 +00:00
Daniel Dunbar	31b44e8f6c	Normalize Subtarget constructors to take a target triple string instead of Module*. Also, dropped uses of TargetMachine where unnecessary. The only target which still takes a TargetMachine& is Mips, I would appreciate it if someone would normalize this to match other targets. llvm-svn: 77918	2009-08-02 22:11:08 +00:00
Evan Cheng	3d8ccdb4be	isThumb2 really should mean thumb2 only, not thumb2+. llvm-svn: 74871	2009-07-06 22:29:14 +00:00
Evan Cheng	2c450d35ae	Change the meaning of predicate hasThumb2 to mean thumb2 ISA is available, not that it's in thumb mode and thumb2 is available. Added isThumb2 predicate to replace the old predicate. llvm-svn: 74692	2009-07-02 06:38:40 +00:00
Bob Wilson	8f74c88cb6	Revert 74164. We'll want to use this method later. llvm-svn: 74176	2009-06-25 16:03:07 +00:00
Bob Wilson	350abb9799	Remove unused hasV6T2Ops method. We already have a separate feature to identify Thumb2. llvm-svn: 74164	2009-06-25 05:20:31 +00:00
Evan Cheng	4e712de541	Latency information for ARM v6. It's rough and not yet hooked up. Right now we are only using branch latency to determine if-conversion limits. llvm-svn: 73747	2009-06-19 01:51:50 +00:00
Evan Cheng	a0ca298f8a	Remove UseThumbBacktraces. Just check if subtarget is darwin. llvm-svn: 73734	2009-06-18 23:14:30 +00:00
Anton Korobeynikov	409105fc95	Rename methods for the sake of consistency. llvm-svn: 73428	2009-06-15 21:46:20 +00:00
Anton Korobeynikov	c82b282b34	Separate V6 from V6T2 since the latter has some extra nice instructions llvm-svn: 73085	2009-06-08 21:20:36 +00:00
Anton Korobeynikov	cd41a9019e	Add helper for checking of Thumb1 mode llvm-svn: 73080	2009-06-08 20:31:02 +00:00
Anton Korobeynikov	12694bd8ac	Implement review feedback. Make thumb2 'normal' subtarget feature llvm-svn: 72698	2009-06-01 20:00:48 +00:00
Anton Korobeynikov	b6f4538683	Add placeholder for thumb2 stuff llvm-svn: 72593	2009-05-29 23:41:08 +00:00
Anton Korobeynikov	0b91cc4260	Add ARMv7 architecture, Cortex processors and different FPU modes handling. llvm-svn: 72337	2009-05-23 19:51:43 +00:00
Anton Korobeynikov	08bf4c0f5a	Propagate CPU string out of SubtargetFeatures llvm-svn: 72335	2009-05-23 19:50:50 +00:00
Dan Gohman	544ab2c50b	Drop ISD::MEMSET, ISD::MEMMOVE, and ISD::MEMCPY, which are not Legal on any current target and aren't optimized in DAGCombiner. Instead of using intermediate nodes, expand the operations, choosing between simple loads/stores, target-specific code, and library calls, immediately. Previously, the code to emit optimized code for these operations was only used at initial SelectionDAG construction time; now it is used at all times. This fixes some cases where rep;movs was being used for small copies where simple loads/stores would be better. This also cleans up code that checks for alignments less than 4; let the targets make that decision instead of doing it in target-independent code. This allows x86 to use rep;movs in low-alignment cases. Also, this fixes a bug that resulted in the use of rep;stos for memsets of 0 with non-constant memory size when the alignment was at least 4. It's better to use the library in this case, which can be significantly faster when the size is large. This also preserves more SourceValue information when memory intrinsics are lowered into simple loads/stores. llvm-svn: 49572	2008-04-12 04:36:06 +00:00
Chris Lattner	f3ebc3f3d2	Remove attribution from file headers, per discussion on llvmdev. llvm-svn: 45418	2007-12-29 20:36:04 +00:00
Rafael Espindola	419b6d7ce4	Make ARM and X86 LowerMEMCPY identical by moving the isThumb check into getMaxInlineSizeThreshold and by restructuring the X86 version. Now I just have to move this to a common place :-) llvm-svn: 43554	2007-10-31 14:39:58 +00:00
Rafael Espindola	063f177300	Make ARM and X86 memcpy expansion more similar to each other. Now both subtargets define getMaxInlineSizeThreshold and the expansion uses it. This should not change generated code. llvm-svn: 43552	2007-10-31 11:52:06 +00:00
Evan Cheng	9f8301413c	Added -march=thumb; removed -enable-thumb. llvm-svn: 34521	2007-02-23 03:14:31 +00:00
Lauro Ramos Venancio	048e16ff8f	Add ABI information to ARM subtarget. llvm-svn: 34245	2007-02-13 19:52:28 +00:00
Evan Cheng	181fe36d6c	Introduce TargetType's ELF and Darwin. llvm-svn: 33363	2007-01-19 19:22:40 +00:00
Evan Cheng	10043e215b	ARM backend contribution from Apple. llvm-svn: 33353	2007-01-19 07:51:42 +00:00


194 Commits