llvm-project

Commit Graph

Author	SHA1	Message	Date
Bill Schmidt	92e26646bc	Fix PR15632: No support for ppcf128 floating-point remainder on PowerPC. For this we need to use a libcall. Previously LLVM didn't implement libcall support for frem, so I've added it in the usual straightforward manner. A test case from the bug report is included. llvm-svn: 178639	2013-04-03 13:05:44 +00:00
Tim Northover	5816ca117b	AArch64: implement ETMv4 trace system registers. llvm-svn: 178637	2013-04-03 12:31:29 +00:00
Aaron Ballman	5f7c680fdc	Second pass at addressing PR15351 by explicitly checking for AVX support when getting the host processor information. It emits a .byte sequence on GNUC compilers to work around lack of xgetbv support with older assemblers, and resolves a comment typo found in the previous patch. llvm-svn: 178636	2013-04-03 12:25:06 +00:00
Timur Iskhodzhanov	f4e0665e56	Fix SRet for thiscall in i686-pc-win32 llvm-svn: 178634	2013-04-03 11:27:54 +00:00
Tim Northover	5b097a735f	AArch64: switch patterns to be type-based rather than RegClass-based It's a bit of churn in the blame log, but I think there are real benefits to the newer system so I'm making the change in one go. llvm-svn: 178633	2013-04-03 11:19:16 +00:00
Eric Christopher	14c2067ca1	Fix grammar. llvm-svn: 178624	2013-04-03 05:29:58 +00:00
Eric Christopher	5590949f29	Remove ZeroOrMore from the option description. We don't need it here. llvm-svn: 178623	2013-04-03 05:26:07 +00:00
Jakob Stoklund Olesen	d9bbdfd3cc	Add 64-bit compare + branch for SPARC v9. The same compare instruction is used for 32-bit and 64-bit compares. It sets two different sets of flags: icc and xcc. This patch adds a conditional branch instruction using the xcc flags for 64-bit compares. llvm-svn: 178621	2013-04-03 04:41:44 +00:00
Hal Finkel	b00fc87608	Remove some unsupported-feature comments from PPC.td These refer to the reciprocal estimate support recently committed. llvm-svn: 178618	2013-04-03 04:03:58 +00:00
Hal Finkel	2e10331057	Use PPC reciprocal estimates with Newton iteration in fast-math mode When unsafe FP math operations are enabled, we can use the fre[s] and frsqrte[s] instructions, which generate reciprocal (sqrt) estimates, together with some Newton iteration, in order to quickly generate floating-point division and sqrt results. All of these instructions are separately optional, and so each has its own feature flag (except for the Altivec instructions, which are covered under the existing Altivec flag). Doing this is not only faster than using the IEEE-compliant fdiv/fsqrt instructions, but allows these computations to be pipelined with other computations in order to hide their overall latency. I've also added a couple of fnmsub patterns which turned out to be missing (but are necessary for good code generation of the Newton iterations). Altivec needs a similar fix, but that will probably be more complicated because fneg is expanded for Altivec's v4f32. llvm-svn: 178617	2013-04-03 04:01:11 +00:00
Rafael Espindola	b9b7ae0c78	Fix the fde encoding used by mips to match gas. This finally fixes the encoding. The patch also * Removes eh-frame.ll. It was an unnecessary .ll to .o test that was checking the wrong value. * Merges fde-reloc.s and eh-frame.s into a single test, since the only difference was the run lines. * Doesn't blindly test the content of the entire .eh_frame section. It makes it hard for anyone actually fixing a bug and hitting a difference in a binary blob. Instead, use a CHECK for each field and document what is being checked. llvm-svn: 178615	2013-04-03 03:13:19 +00:00
Aaron Ballman	9c0f0af54f	Rolling back the AVX support patch due to breaking a gcc 4.6 build bot that doesn't understand the xgetbv instruction for some reason. Will revisit when time permits. llvm-svn: 178614	2013-04-03 03:11:39 +00:00
Michael Gottesman	b8c8836594	Remove an optimization where we were changing an objc_autorelease into an objc_autoreleaseReturnValue. The semantics of ARC implies that a pointer passed into an objc_autorelease must live until some point (potentially down the stack) where an autorelease pool is popped. On the other hand, an objc_autoreleaseReturnValue just signifies that the object must live until the end of the given function at least. Thus objc_autorelease is stronger than objc_autoreleaseReturnValue in terms of the semantics of ARC* implying that performing the given strength reduction without any knowledge of how this relates to the autorelease pool pop that is further up the stack violates the semantics of ARC. *Even though objc_autoreleaseReturnValue if you know that no RV optimization will occur is more computationally expensive. llvm-svn: 178612	2013-04-03 02:57:24 +00:00
Michael Gottesman	624243914f	Improved comment. No functionality change. llvm-svn: 178605	2013-04-03 01:57:16 +00:00
Aaron Ballman	56be6ba5e4	Attempting to fix the build on older GCC versions. llvm-svn: 178604	2013-04-03 01:39:37 +00:00
Aaron Ballman	6bc0dfc7bd	This patch addresses PR15351 by explicitly checking for AVX support when getting the host processor information. llvm-svn: 178598	2013-04-03 00:33:32 +00:00
Eric Christopher	e2fbc67e81	Formatting. llvm-svn: 178589	2013-04-02 23:06:40 +00:00
Akira Hatanaka	023c678a0d	[mips] Small update to the implementation of eh.return for Mips. This patch initializes t9 to the handler address, but only if the relocation model is pic. This handles the case where handler to which eh.return jumps points to the start of the function. Patch by Sasa Stankovic. llvm-svn: 178588	2013-04-02 23:02:07 +00:00
Eric Christopher	6476f908b3	Support and test template arguments for unions. llvm-svn: 178586	2013-04-02 22:55:56 +00:00
Eric Christopher	17dd8f07c6	Reformat arguments. llvm-svn: 178585	2013-04-02 22:55:52 +00:00
Akira Hatanaka	2ffc5734e7	[mips] Expand pseudo multiply/divide instructions in MipsCodeEmitter.cpp. This patch fixes the following two tests which have been failing on llvm-mips-linux builder since r178403: LLVM :: Analysis/Profiling/load-branch-weights-ifs.ll LLVM :: Analysis/Profiling/load-branch-weights-loops.ll llvm-svn: 178584	2013-04-02 22:53:58 +00:00
Jakob Stoklund Olesen	aeb69a5481	Allow MachineTraceMetrics to be used when the model has no resources. It is still possible to extract information from itineraries, for example. llvm-svn: 178582	2013-04-02 22:27:45 +00:00
Chad Rosier	8a24466f69	[ms-inline asm] Add support for parsing variables with namespace alias qualifiers. This patch only adds support for parsing these identifiers in the X86AsmParser. The front-end interface isn't capable of looking up these identifiers at this point in time. The end result is the compiler now errors during object file emission, rather than at parse time. Test case coming shortly. Part of rdar://13499009 and PR13340 llvm-svn: 178566	2013-04-02 20:02:33 +00:00
Bill Schmidt	3581cd4b4c	Fix PR15630: Replace faulty stdcx. with stwcx. When doing a partword atomic operation, a lwarx was being paired with a stdcx. instead of a stwcx. when compiling for a 64-bit target. The target has nothing to do with it in this case; we always need a stwcx. Thanks to Kai Nacke for reporting the problem. llvm-svn: 178559	2013-04-02 18:37:08 +00:00
Jakob Stoklund Olesen	8fbfc59164	Don't attempt MTM heuristics without a scheduling model present. This should fix the PPC buildbots. llvm-svn: 178558	2013-04-02 18:26:45 +00:00
Jakob Stoklund Olesen	3ca14772d0	Count processor resources individually in MachineTraceMetrics. The new instruction scheduling models provide information about the number of cycles consumed on each processor resource. This makes it possible to estimate ILP more accurately than simply counting instructions / issue width. The functions getResourceDepth() and getResourceLength() now identify the limiting processor resource, and return a cycle count based on that. This gives more precise resource information, particularly in traces that use one resource a lot more than others. llvm-svn: 178553	2013-04-02 17:49:51 +00:00
Chad Rosier	7925d280ff	[fast-isel] Use the correct API to disable FastLowerArguments for Win64. llvm-svn: 178549	2013-04-02 16:31:41 +00:00
Arnold Schwaighofer	d6c6e868b2	DAGCombiner: Merge store/loads when we have extload/truncstores This helps on architectures where i8 and i16 are not legal but we have byte and short loads/stores, allowing us to merge copies like the one below on ARM. copy(char a, char b, int n) { do { int t0 = a[0]; int t1 = a[1]; b[0] = t0; b[1] = t1; radar://13536387 llvm-svn: 178546	2013-04-02 15:58:51 +00:00
Justin Holewinski	a922c7e90e	[NVPTX] Fix a few style issues in NVVMReflect llvm-svn: 178536	2013-04-02 12:37:11 +00:00
Bill Wendling	88d06c3b2d	Use a worklist to avoid a sneaky iterator invalidation. The iterator could be invalidated when it's recursively deleting a whole bunch of constant expressions in a constant initializer. Note: This was only reproducible if `opt' was run on a `.bc' file. If `opt' was run on a `.ll' file, it wouldn't crash. This is why the test first pushes the `.ll' file through `llvm-as' before feeding it to `opt'. PR15440 llvm-svn: 178531	2013-04-02 08:16:45 +00:00
Jakob Stoklund Olesen	8eabc3ffde	Add 64-bit load and store instructions. There are only a few new instructions, the rest is handled with patterns. llvm-svn: 178528	2013-04-02 04:09:28 +00:00
Jakob Stoklund Olesen	917e07f095	Basic 64-bit ALU operations. SPARC v9 extends all ALU instructions to 64 bits, so we simply need to add patterns to use them for both i32 and i64 values. llvm-svn: 178527	2013-04-02 04:09:23 +00:00
Jakob Stoklund Olesen	bddb20eeef	Materialize 64-bit immediates. The last resort pattern produces 6 instructions, and there are still opportunities for materializing some immediates in fewer instructions. llvm-svn: 178526	2013-04-02 04:09:17 +00:00
Jakob Stoklund Olesen	c1d1a4816e	Add 64-bit shift instructions. SPARC v9 defines new 64-bit shift instructions. The 32-bit shift right instructions are still usable as zero and sign extensions. This adds new F3_Sr and F3_Si instruction formats that probably should be used for the 32-bit shifts as well. They don't really encode an simm13 field. llvm-svn: 178525	2013-04-02 04:09:12 +00:00
Jakob Stoklund Olesen	739d722ef7	Add predicates for distinguishing 32-bit and 64-bit modes. The 'sparc' architecture produces 32-bit code while 'sparcv9' produces 64-bit code. It is also possible to run 32-bit code using SPARC v9 instructions with: llc -march=sparc -mattr=+v9 llvm-svn: 178524	2013-04-02 04:09:06 +00:00
Jakob Stoklund Olesen	0b21f35aca	Add support for 64-bit calling convention. This is far from complete, but it is enough to make it possible to write test cases using i64 arguments. Missing features: - Floating point arguments. - Receiving arguments on the stack. - Calls. llvm-svn: 178523	2013-04-02 04:09:02 +00:00
Jakob Stoklund Olesen	5ad3b35377	Add an I64Regs register class for 64-bit registers. We are going to use the same registers for 32-bit and 64-bit values, but in two different register classes. The I64Regs register class has a larger spill size and alignment. The addition of an i64 register class confuses TableGen's type inference, so it is necessary to clarify the type of some immediates and the G0 register. In 64-bit mode, pointers are i64 and should use the I64Regs register class. Implement getPointerRegClass() to dynamically provide the pointer register class depending on the subtarget. Use ptr_rc and iPTR for memory operands. Finally, add the i64 type to the IntRegs register class. This register class is not used to hold i64 values, I64Regs is for that. The type is required to appease TableGen's type checking in output patterns like this: def : Pat<(add i64:$a, i64:$b), (ADDrr $a, $b)>; SPARC v9 uses the same ADDrr instruction for i32 and i64 additions, and TableGen doesn't know to check the type of register sub-classes. llvm-svn: 178522	2013-04-02 04:08:54 +00:00
Hal Finkel	93d75ea08a	Fix typo in PPCISelLowering Thanks to Bill Schmidt for finding this in review of r178480. llvm-svn: 178521	2013-04-02 03:29:51 +00:00
Andrew Trick	e1d88cfb57	The divide unit is not pipelined, but it is still buffered. Buffered means a later divide may be executed out-of-order while a prior divide is sitting (buffered) in a reservation station. You can tell it's not pipelined, because operations that use it reserve it for more than one cycle: def : WriteRes<WriteIDiv, [HWPort0, HWDivider]> { let Latency = 25; let ResourceCycles = [1, 10]; } We don't currently distinguish between an unpipelined operation and one that is split into multiple micro-ops requiring the same unit, except that the latter may have NumMicroOps > 1 if they also consume issue/dispatch resources. llvm-svn: 178519	2013-04-02 01:58:47 +00:00
NAKAMURA Takumi	fd98f7f2b6	Target/R600: Fix CMake build to add missing files. llvm-svn: 178508	2013-04-01 22:05:58 +00:00
Jack Carter	9423f507b1	Mips direct object exception handling regression Revision 177141 caused a regression in all but mips64 little endian. That is because none of the other Mips targets had test cases checking the contents of the .eh_frame section. This patch fixes both the llvm code and adds an assembler test case to include the current 4 flavors. The test cases unfortunately rely on llvm-objdump. A preferable method would be to use a pretty printer output such as what readelf -wf <elf_file> would give. I also changed the name of the test case to correct a typo. llvm-svn: 178506	2013-04-01 21:55:15 +00:00
Vincent Lejeune	bfaa63a6db	R600: Add support for native control flow llvm-svn: 178505	2013-04-01 21:48:05 +00:00
Vincent Lejeune	ace6f7351e	R600/SI: Share code recording ShaderTypeAttribute between generations llvm-svn: 178504	2013-04-01 21:47:53 +00:00
Vincent Lejeune	f43bc57b66	R600: Emit CF_ALU and use true kcache register. llvm-svn: 178503	2013-04-01 21:47:42 +00:00
Eli Bendersky	e60fc2f676	Fix top-comment header and some indentation llvm-svn: 178492	2013-04-01 19:47:56 +00:00
Hal Finkel	3f88d08974	Fix a bad assert in PPCTargetLowering llvm-svn: 178489	2013-04-01 18:42:58 +00:00
Shuxin Yang	6662fd0f15	Correct assertion condition llvm-svn: 178484	2013-04-01 18:13:05 +00:00
Arnold Schwaighofer	6752366ed7	Merge load/store sequences with addresses: base + index + offset We would also like to merge sequences that involve a variable index like in the example below. int index = *idx++ int i0 = c[index+0]; int i1 = c[index+1]; b[0] = i0; b[1] = i1; By extending the parsing of the base pointer to handle dags that contain a base, index, and offset we can handle examples like the one above. The dag for the code above will look something like: (load (i64 add (i64 copyfromreg %c) (i64 signextend (i8 load %index)))) (load (i64 add (i64 copyfromreg %c) (i64 signextend (i32 add (i32 signextend (i8 load %index)) (i32 1))))) The code that parses the tree ignores the intermediate sign extensions. However, if there is a sign extension it needs to be on all indexes. (load (i64 add (i64 copyfromreg %c) (i64 signextend (add (i8 load %index) (i8 1)))) vs (load (i64 add (i64 copyfromreg %c) (i64 signextend (i32 add (i32 signextend (i8 load %index)) (i32 1))))) radar://13536387 llvm-svn: 178483	2013-04-01 18:12:58 +00:00
Hal Finkel	f6d45f2379	Add more PPC floating-point conversion instructions The P7 and A2 have additional floating-point conversion instructions which allow a direct two-instruction sequence (plus load/store) to convert from all combinations (signed/unsigned i32/i64) <--> (float/double) (on previous cores, only some combinations were directly available). llvm-svn: 178480	2013-04-01 17:52:07 +00:00
Hal Finkel	39caf9f5ec	Use ImmToIdxMap.count in PPCRegisterInfo Code improvement suggested by Jakob (in review of r178450). No functionality change intended. llvm-svn: 178473	2013-04-01 17:02:06 +00:00
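
Commit 88d06c3b2d in the list above replaces recursive deletion with a worklist to avoid iterator invalidation. The general pattern can be sketched as follows. This is a simplified model, not the LLVM code: the `Graph` type and `erase_reachable` are hypothetical names, with each integer id standing in for a constant expression and its vector for the operands it owns.

```cpp
#include <map>
#include <vector>

// Hypothetical model: each id maps to the ids it owns.
using Graph = std::map<int, std::vector<int>>;

// Erases every id in `roots` plus everything reachable from them and
// returns how many entries were removed. No iterator into `g` is held
// across an erase: pending ids wait on a worklist and are looked up again
// by key when their turn comes.
int erase_reachable(Graph &g, const std::vector<int> &roots) {
    std::vector<int> worklist(roots.begin(), roots.end());
    int erased = 0;
    while (!worklist.empty()) {
        int id = worklist.back();
        worklist.pop_back();
        auto it = g.find(id);
        if (it == g.end()) {
            continue;  // already erased through another path
        }
        // Queue the owned ids first, since erase destroys the vector.
        worklist.insert(worklist.end(), it->second.begin(), it->second.end());
        g.erase(it);
        ++erased;
    }
    return erased;
}
```

The point of the design is that a deletion never happens while some outer loop still holds a position inside the container being modified, so cascading deletions cannot leave a dangling iterator behind.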


60383 Commits