llvm-project

Commit Graph

Author	SHA1	Message	Date
Bob Wilson	157fec42c9	Use pseudo instructions for 2-register Neon instructions for scalar FP. Partial fix for Radar 8711675. llvm-svn: 121716	2010-12-13 21:05:52 +00:00
Bob Wilson	52f522720e	Remove unused instruction class arguments. llvm-svn: 121715	2010-12-13 21:05:44 +00:00
Bob Wilson	9375d27460	Add float patterns for Neon vld1-lane/dup and vst1-lane operations. llvm-svn: 121583	2010-12-10 22:13:32 +00:00
Bob Wilson	e1d3322111	Remove unused arguments. llvm-svn: 121582	2010-12-10 22:13:24 +00:00
Evan Cheng	62c7b5bf76	Making use of VFP / NEON floating point multiply-accumulate / subtraction is difficult on current ARM implementations for a few reasons. 1. Even though a single vmla has latency that is one cycle shorter than a pair of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause additional pipeline stall. So it's frequently better to single codegen vmul + vadd. 2. A vmla folowed by a vmul, vmadd, or vsub causes the second fp instruction to stall for 4 cycles. We need to schedule them apart. 3. A vmla followed vmla is a special case. Obvious issuing back to back RAW vmla + vmla is very bad. But this isn't ideal either: vmul vadd vmla Instead, we want to expand the second vmla: vmla vmul vadd Even with the 4 cycle vmul stall, the second sequence is still 2 cycles faster. Up to now, isel simply avoid codegen'ing fp vmla / vmls. This works well enough but it isn't the optimial solution. This patch attempts to make it possible to use vmla / vmls in cases where it is profitable. A. Add missing isel predicates which cause vmla to be codegen'ed. B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to compute a fmul and a fmla. C. Add additional isel checks for vmla, avoid cases where vmla is feeding into fp instructions (except for the #3 exceptional case). D. Add ARM hazard recognizer to model the vmla / vmls hazards. E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the vmla / vmls will trigger one of the special hazards. Work in progress, only A+B are enabled. llvm-svn: 120960	2010-12-05 22:04:16 +00:00
Jim Grosbach	371e586544	Fix copy/pasto in vmin.f32 encoding. llvm-svn: 120709	2010-12-02 16:30:58 +00:00
Owen Anderson	4472801765	Use by-name rather than by-order matching for NEON operands. llvm-svn: 120507	2010-12-01 00:28:25 +00:00
Bob Wilson	318ce7cb3f	Fix the encoding of VLD4-dup alignment. The only reasonable way I could find to do this is to provide an alternate version of the addrmode6 operand with a different encoding function. Use it for all the VLD-dup instructions for the sake of consistency. llvm-svn: 120358	2010-11-30 00:00:42 +00:00
Bob Wilson	0b27b68164	Rename VLDnDUP instructions with double-spaced registers in an attempt to make things a little more consistent. llvm-svn: 120357	2010-11-30 00:00:38 +00:00
Bob Wilson	431ac4ef50	Add support for NEON VLD3-dup instructions. The encoding for alignment in VLD4-dup instructions is still a work in progress. llvm-svn: 120356	2010-11-30 00:00:35 +00:00
Bob Wilson	77ab165afe	Add support for NEON VLD3-dup instructions. llvm-svn: 120312	2010-11-29 19:35:29 +00:00
Bob Wilson	2d790df105	Add support for NEON VLD2-dup instructions. llvm-svn: 120236	2010-11-28 06:51:26 +00:00
Bob Wilson	04b2c94205	Another minor refactoring for VLD1DUP instructions. The op11_8 field is the same for all of them so put it in the instruction classes instead of specifying it separately for each instruction. llvm-svn: 120234	2010-11-28 06:51:15 +00:00
Bob Wilson	d74cf2c8f6	Refactor. Set alignment bit in VLD1-dup instruction classes. llvm-svn: 120197	2010-11-27 07:12:02 +00:00
Bob Wilson	c92eea0175	Add NEON VLD1-dup instructions (load 1 element to all lanes). llvm-svn: 120194	2010-11-27 06:35:16 +00:00
Owen Anderson	7e484e0be7	Use by-name rather than by-order operand matching for some NEON encodings. llvm-svn: 119923	2010-11-21 06:47:06 +00:00
Owen Anderson	b4fd2c90e9	The Vm and Vn register fields must be the same for a register-register vmov. llvm-svn: 119867	2010-11-19 23:12:43 +00:00
Jim Grosbach	785952e5ac	Operand names llvm-svn: 119864	2010-11-19 22:43:08 +00:00
Jim Grosbach	7d8df3185f	Clarify operand names. llvm-svn: 119858	2010-11-19 22:36:02 +00:00
Jim Grosbach	9c335bf977	Remove trailing whitespace. llvm-svn: 119608	2010-11-18 01:39:50 +00:00
Jim Grosbach	a74c7ccd59	ARM PseudoInst instructions don't need or use an assembler string. Get rid of the operand to the pattern. llvm-svn: 119607	2010-11-18 01:38:26 +00:00
Bill Wendling	a68e3a5397	Encode the multi-load/store instructions with their respective modes ('ia', 'db', 'ib', 'da') instead of having that mode as a separate field in the instruction. It's more convenient for the asm parser and much more readable for humans. <rdar://problem/8654088> llvm-svn: 119310	2010-11-16 01:16:36 +00:00
Owen Anderson	c7baee31ad	Add support for ARM's specialized vector-compare-against-zero instructions. llvm-svn: 118453	2010-11-08 23:21:22 +00:00
Owen Anderson	30c4892ea5	Add codegen and encoding support for the immediate form of vbic. llvm-svn: 118291	2010-11-05 19:27:46 +00:00
Owen Anderson	0747307049	Add support for code generation of the one register with immediate form of vorr. We could be more aggressive about making this work for a larger range of constants, but this seems like a good start. llvm-svn: 118201	2010-11-03 22:44:51 +00:00
Owen Anderson	bb81f80af6	Unlike a lot of NEON instructions, vext isn't _actually_ parameterized by element size. Instead, all of the different element sizes are pseudo instructions that map down to vext.8 underneath, with the immediate shifted left to reflect the increased element size. llvm-svn: 118183	2010-11-03 18:16:27 +00:00
Bob Wilson	7d0ac84abd	Add codegen patterns for VST1-lane instructions. Radar 8599955. llvm-svn: 118176	2010-11-03 16:24:53 +00:00
Jim Grosbach	c6af2b4066	Break ARM addrmode4 (load/store multiple base address) into its constituent parts. Represent the operation mode as an optional operand instead. rdar://8614429 llvm-svn: 118137	2010-11-03 01:01:43 +00:00
Owen Anderson	0ebd1fd594	Revert r118097 to fix buildbots. llvm-svn: 118121	2010-11-02 23:47:29 +00:00
Owen Anderson	7c30390277	Since these fields are not exactly equivalent to the encoded field, rename them to something with semantic meaning. llvm-svn: 118097	2010-11-02 22:41:42 +00:00
Owen Anderson	dec87e10fd	Provide correct encodings for the remaining vst variants that we currently generate. llvm-svn: 118087	2010-11-02 22:18:18 +00:00
Owen Anderson	adf88d4c5f	Tentative encodings for the "single element from one lane" variant of vst1. llvm-svn: 118084	2010-11-02 21:54:45 +00:00
Owen Anderson	b95618cfe0	Add correct encodings for basic variants for vst3 and vst4. llvm-svn: 118082	2010-11-02 21:47:03 +00:00
Bob Wilson	d80b29d6f7	Add NEON VST1-lane instructions. Partial fix for Radar 8599955. llvm-svn: 118069	2010-11-02 21:18:25 +00:00
Owen Anderson	fa08e1e277	Add correct encodings for the basic variants for vst2. llvm-svn: 118068	2010-11-02 21:16:58 +00:00
Owen Anderson	87c62e54e6	Add correct encodings for the basic form of vst1. llvm-svn: 118067	2010-11-02 21:06:06 +00:00
Owen Anderson	9f20daf3b4	Factor out a common encoding class for loads and stores with a lane parameter. llvm-svn: 118055	2010-11-02 20:47:39 +00:00
Owen Anderson	a83859539f	Add correct encodings for the rest of the vld instructions that we generate. llvm-svn: 118053	2010-11-02 20:40:59 +00:00
Owen Anderson	526ffd57d2	Add correct NEON encodings for vld2, vld3, and vld4 basic variants. llvm-svn: 117997	2010-11-02 01:24:55 +00:00
Owen Anderson	b3ca2060c0	Attempt to provide correct encodings for a number of other vld1 variants, which we can't test since we can neither generate nor parse them at the moment. llvm-svn: 117988	2010-11-02 00:24:52 +00:00
Owen Anderson	ad40234eff	Add correct NEON encodings for the "multiple single elements" form of vld. llvm-svn: 117984	2010-11-02 00:05:05 +00:00
Bob Wilson	dc44990c7d	Add NEON VLD1-lane instructions. Partial fix for Radar 8599955. llvm-svn: 117964	2010-11-01 22:04:05 +00:00
Owen Anderson	2ef668840a	Add correct NEON encodings for vtbl and vtbx. llvm-svn: 117513	2010-10-28 00:18:46 +00:00
Owen Anderson	14be930317	Add correct NEON encodings for vext, vtrn, vuzp, and vzip. llvm-svn: 117512	2010-10-27 23:56:39 +00:00
Owen Anderson	fadb951e5b	Provide correct encodings for NEON vcvt, which has its own special immediate encoding for specifying fractional bits for fixed point conversions. llvm-svn: 117501	2010-10-27 22:49:00 +00:00
Owen Anderson	ed9652f959	Provide correct encodings for the get_lane and set_lane variants of vmov. llvm-svn: 117495	2010-10-27 21:28:09 +00:00
Owen Anderson	40d24a4abf	Provide correct NEON encodings for vdup. llvm-svn: 117475	2010-10-27 19:25:54 +00:00
Owen Anderson	8576a42cf3	Add correct NEON encodings for vsli and vsri. llvm-svn: 117459	2010-10-27 17:40:08 +00:00
Owen Anderson	d7e8135e1e	Add correct NEON encodings for vsra and vrsra. llvm-svn: 117458	2010-10-27 17:29:29 +00:00
Owen Anderson	825b2d1946	Add correct NEON encodings for vqshl, vqshrn, vqshrun, vqrshl, vqshrn, and vqrshrun. llvm-svn: 117411	2010-10-26 22:50:46 +00:00
Owen Anderson	2888e2c7f9	Correct NEON encodings for vshrn, vrshl, vrshr, vrshrn. llvm-svn: 117402	2010-10-26 21:58:41 +00:00
Owen Anderson	e18579976f	Simplify classes for shift instructions, which are never commutable. llvm-svn: 117398	2010-10-26 21:13:59 +00:00
Owen Anderson	3665fee8de	Provide correct NEON encodings for vshl, register and immediate forms. llvm-svn: 117394	2010-10-26 20:56:57 +00:00
Owen Anderson	691ce68d3c	Add correct NEON encoding for vpadal. llvm-svn: 117380	2010-10-26 18:18:03 +00:00
Owen Anderson	284cb361d1	Add NEON encodings for vmov and vmvn of immediates. llvm-svn: 117374	2010-10-26 17:40:54 +00:00
Owen Anderson	1f6aad053d	Add correct encodings for NEON vabal. llvm-svn: 117315	2010-10-25 21:29:04 +00:00
Owen Anderson	b9c91679aa	Add correct NEON encodings for vaba. llvm-svn: 117309	2010-10-25 20:52:57 +00:00
Owen Anderson	dd001b89d7	Attempt to provide correct encodings for NEON vbit and vbif, even though we can't test them at the moment. llvm-svn: 117294	2010-10-25 20:17:22 +00:00
Owen Anderson	dea09c7564	Provide correct NEON encodings for vbsl. llvm-svn: 117293	2010-10-25 20:13:13 +00:00
Owen Anderson	2477446ee5	Add correct instruction encodings for vbic, vorn, and vmvn. llvm-svn: 117282	2010-10-25 18:43:52 +00:00
Owen Anderson	feb3ee0c93	Add NEON encoding tests for vcgt and vacgt. llvm-svn: 117276	2010-10-25 18:03:59 +00:00
Owen Anderson	e5d0677173	Add tests for NEON encodings of vcge and vacge. llvm-svn: 117274	2010-10-25 17:49:32 +00:00
Owen Anderson	c178b80f65	Add a warning about our inability to test the encoding of vceq with immediate zero. llvm-svn: 117273	2010-10-25 17:33:02 +00:00
Owen Anderson	9d0122af7d	Add correct NEON encodings for vqdmlal. llvm-svn: 117134	2010-10-22 19:35:48 +00:00
Owen Anderson	3d0264667f	Provide correct encodings for NEON vmlal. llvm-svn: 117131	2010-10-22 19:05:25 +00:00
Owen Anderson	f48719f1b5	Provide correct NEON encodings for vmla. llvm-svn: 117126	2010-10-22 18:54:37 +00:00
Owen Anderson	9e44cf2bb2	ARM encodes Q registers as 2xregno (i.e. the number of the D register that corresponds to the lower half of the Q register), rather than with just regno. This allows us to unify the encodings for a lot of different NEON instrucitons that differ only in whether they have Q or D register operands. llvm-svn: 117056	2010-10-21 20:21:49 +00:00
Owen Anderson	6b7e401049	Add correct NEON encodings for vhadd and vrhadd. llvm-svn: 117047	2010-10-21 18:55:04 +00:00
Owen Anderson	9561084188	Add correct encodings for NEON vaddw.s* and vaddw.u*. llvm-svn: 117040	2010-10-21 18:20:25 +00:00
Owen Anderson	15c97706e8	Provide correct NEON encodings for vaddl.u* and vaddl.s*. llvm-svn: 117039	2010-10-21 18:09:17 +00:00
Owen Anderson	6083502848	Implement correct encodings for NEON vadd, both integer and floating point. llvm-svn: 116981	2010-10-21 00:48:00 +00:00
Jim Grosbach	340cd5174b	A few 80 column fixes. llvm-svn: 116451	2010-10-13 23:34:31 +00:00
Evan Cheng	e790afcbe1	More ARM scheduling itinerary fixes. llvm-svn: 116266	2010-10-11 23:41:41 +00:00
Evan Cheng	94ad008beb	Proper VST scheduling itineraries. llvm-svn: 116251	2010-10-11 22:03:18 +00:00
Evan Cheng	d7a404d85f	Add VLD4 scheduling itineraries. llvm-svn: 116143	2010-10-09 04:07:58 +00:00
Evan Cheng	a762400bed	Finish vld3 and vld4. llvm-svn: 116140	2010-10-09 01:45:34 +00:00
Evan Cheng	05f13e94bf	Correct some load / store instruction itinerary mistakes: 1. Cortex-A8 load / store multiplies can only issue on ALU0. 2. Eliminate A8_Issue, A8_LSPipe will correctly limit the load / store issues. 3. Correctly model all vld1 and vld2 variants. llvm-svn: 116134	2010-10-09 01:03:04 +00:00
Evan Cheng	1958cefd69	Model operand cycles of vldm / vstm; also fixes scheduling itineraries of vldr / vstr, etc. llvm-svn: 115898	2010-10-07 01:50:48 +00:00
Jim Grosbach	2e3e2a006b	Change the NEON VDUPfdf and VDUPfqf pseudo-instructions to actually be pseudo instructions. llvm-svn: 115840	2010-10-06 21:16:16 +00:00
Jim Grosbach	233b3a2f95	Add a 'pattern' arg to the ARM PseudoNeonI class. llvm-svn: 115831	2010-10-06 20:36:55 +00:00
Jim Grosbach	fae8305e2b	Nuke the rest of the :comment references llvm-svn: 115373	2010-10-01 23:21:38 +00:00
Evan Cheng	1969887fc6	Fix scheduling infor for vmovn and vshrn which I broke accidentially. llvm-svn: 115354	2010-10-01 21:48:06 +00:00
Evan Cheng	2a5d764858	NEON scheduling info fix. vmov reg, reg are single cycle instructions. llvm-svn: 115344	2010-10-01 20:50:58 +00:00
Bob Wilson	6b853c3ce3	Change VLDMQ and VSTMQ to be pseudo instructions. They are expanded after register allocation to VLDMD and VSTMD respectively. This avoids using the dregpair operand modifier. llvm-svn: 114047	2010-09-16 00:31:02 +00:00
Bob Wilson	b1e9d4bff1	Use VLD1/VST1 pseudo instructions for loadRegFromStackSlot and storeRegToStackSlot. llvm-svn: 113918	2010-09-15 01:48:05 +00:00
Jim Grosbach	c7cf42d80b	Reapply r113875 with additional cleanups. "The register specified for a dregpair is the corresponding Q register, so to get the pair, we need to look up the sub-regs based on the qreg. Create a lookup function since we don't have access to TargetRegisterInfo here to be able to use getSubReg(ARM::dsub_[01])." Additionaly, fix the NEON VLD1* and VST1* instruction patterns not to use the dregpair modifier for the 2xdreg versions. Explicitly specifying the two registers as operands is more correct and more consistent with the other instruction patterns. This enables further cleanup of special case code in the disassembler as a nice side-effect. llvm-svn: 113903	2010-09-14 23:54:06 +00:00
Bob Wilson	dd29db5635	Make NEON ld/st pseudo instruction classes take the instruction itinerary as an argument, so that we can distinguish instructions with the same register classes but different numbers of registers (e.g., vld3 and vld4). Fix some of the non-pseudo NEON ld/st instruction itineraries to reflect the number of registers loaded or stored, not just the opcode name. llvm-svn: 113854	2010-09-14 20:59:49 +00:00
Bob Wilson	c597fd3b4a	Convert some VTBL and VTBX instructions to use pseudo instructions prior to register allocation. Remove the NEONPreAllocPass, which is no longer needed. Yeah!! llvm-svn: 113818	2010-09-13 23:55:10 +00:00
Bob Wilson	d5c57a5ed4	Switch all the NEON vld-lane and vst-lane instructions over to the new pseudo-instruction approach. Change ARMExpandPseudoInsts to use a table to record all the NEON load/store information. llvm-svn: 113812	2010-09-13 23:01:35 +00:00
Bob Wilson	4adbaf1843	Fix NEON VLD pseudo instruction itineraries that were incorrectly copied from the VST pseudos. The VLD/VST scheduling still needs work (see pr6722), but at least we shouldn't confuse the loads with the stores. llvm-svn: 113473	2010-09-09 05:40:26 +00:00
Jim Grosbach	abcbe2474d	VFP/NEON load/store multiple instructions are addrmode4, not 5. llvm-svn: 113322	2010-09-08 00:25:50 +00:00
Bob Wilson	35fafca587	Finish converting the rest of the NEON VLD instructions to use pseudo- instructions prior to regalloc. Since it's getting a little close to the 2.8 branch deadline, I'll have to leave the rest of the instructions handled by the NEONPreAllocPass for now, but I didn't want to leave half of the VLD instructions converted and the other half not. llvm-svn: 112983	2010-09-03 18:16:02 +00:00
Bob Wilson	f65c9ef720	Replace NEON vabdl, vaba, and vabal intrinsics with combinations of the vabd intrinsic and add and/or zext operations. In the case of vaba, this also avoids the need for a DAG combine pattern to combine vabd with add. Update tests. Auto-upgrade the old intrinsics. llvm-svn: 112941	2010-09-03 01:35:08 +00:00
Bob Wilson	75a6408f88	Convert VLD1 and VLD2 instructions to use pseudo-instructions until after regalloc. llvm-svn: 112825	2010-09-02 16:00:54 +00:00
Bob Wilson	38ab35a911	Remove NEON vmull, vmlal, and vmlsl intrinsics, replacing them with multiply, add, and subtract operations with zero-extended or sign-extended vectors. Update tests. Add auto-upgrade support for the old intrinsics. llvm-svn: 112773	2010-09-01 23:50:19 +00:00
Bob Wilson	4cd8a126c3	Remove NEON vmovn intrinsic, replacing it with vector truncate operations. Auto-upgrade the old intrinsic and update tests. llvm-svn: 112507	2010-08-30 20:02:30 +00:00
Bob Wilson	d0c054886c	Remove NEON vaddl, vaddw, vsubl, and vsubw intrinsics. Instead, use llvm IR add/sub operations with one or both operands sign- or zero-extended. Auto-upgrade the old intrinsics. llvm-svn: 112416	2010-08-29 05:57:34 +00:00
Bob Wilson	950882be07	Use pseudo instructions for VST1 and VST2. llvm-svn: 112357	2010-08-28 05:12:57 +00:00
Bob Wilson	8ee9394750	We don't need to custom-select VLDMQ and VSTMQ anymore. llvm-svn: 112336	2010-08-28 00:20:11 +00:00
Bob Wilson	13ce07fa92	Change ARM VFP VLDM/VSTM instructions to use addressing mode #4 , just like all the other LDM/STM instructions. This fixes asm printer crashes when compiling with -O0. I've changed one of the NEON tests (vst3.ll) to run with -O0 to check this in the future. Prior to this change VLDM/VSTM used addressing mode #5, but not really. The offset field was used to hold a count of the number of registers being loaded or stored, and the AM5 opcode field was expanded to specify the IA or DB mode, instead of the standard ADD/SUB specifier. Much of the backend was not aware of these special cases. The crashes occured when rewriting a frameindex caused the AM5 offset field to be changed so that it did not have a valid submode. I don't know exactly what changed to expose this now. Maybe we've never done much with -O0 and NEON. Regardless, there's no longer any reason to keep a count of the VLDM/VSTM registers, so we can use addressing mode #4 and clean things up in a lot of places. llvm-svn: 112322	2010-08-27 23:18:17 +00:00

1 2 3 4 5 ...

356 Commits