llvm-project

Commit Graph

Author	SHA1	Message	Date
Ulrich Weigand	aa0ac4f11c	[PowerPC] ELFv2 function call changes This patch builds upon the two preceding MC changes to implement the basic ELFv2 function call convention. In the ELFv1 ABI, a "function descriptor" was associated with every function, pointing to both the entry address and the related TOC base (and a static chain pointer for nested functions). Function pointers would actually refer to that descriptor, and the indirect call sequence needed to load up both entry address and TOC base. In the ELFv2 ABI, there are no more function descriptors, and function pointers simply refer to the (global) entry point of the function code. Indirect function calls simply branch to that address, after loading it up into r12 (as required by the ABI rules for a global entry point). Direct function calls continue to just do a "bl" to the target symbol; this will be resolved by the linker to the local entry point of the target function if it is local, and to a PLT stub if it is global. That PLT stub would then load the (global) entry point address of the final target into r12 and branch to it. Note that when performing a local function call, r2 must be set up to point to the current TOC base: if the target ends up local, the ABI requires that its local entry point is called with r2 set up; if the target ends up global, the PLT stub requires that r2 is set up. This patch implements all LLVM changes to implement that scheme: - No longer create a function descriptor when emitting a function definition (in EmitFunctionEntryLabel) - Emit two entry points if the function needs the TOC base (r2) anywhere (this is done in EmitFunctionBodyStart; note that this cannot be done in EmitFunctionEntryLabel because the global entry point prologue code must be part of the function as covered by debug info). - In order to make use tracking of r2 (as needed above) work correctly, mark direct function calls as implicitly using r2. 
- Implement the ELFv2 indirect function call sequence (no function descriptors; load target address into r12). - When creating an ELFv2 object file, emit the .abiversion 2 directive to tell the linker to create the appropriate version of PLT stubs. Reviewed by Hal Finkel. llvm-svn: 213489	2014-07-20 23:31:44 +00:00
Hal Finkel	07c9bb3d87	[LoopVectorize] Remove an unused private AA pointer Thanks to the lld-x86_64-darwin13 builder for catching this first. llvm-svn: 213488	2014-07-20 23:28:25 +00:00
Ulrich Weigand	46797c6960	[MC] Pass MCSymbolData to needsRelocateWithSymbol As discussed in a previous checkin to support the .localentry directive on PowerPC, we need to inspect the actual target symbol in needsRelocateWithSymbol to make the appropriate decision based on that symbol's st_other bits. Currently, needsRelocateWithSymbol does not get the target symbol. However, it is directly available to its sole caller. This patch therefore simply extends needsRelocateWithSymbol with a new parameter "const MCSymbolData &SD", passes in the target symbol, and updates all derived implementations. In particular, in the PowerPC implementation, this patch removes the FIXME added by the previous checkin. llvm-svn: 213487	2014-07-20 23:15:06 +00:00
Hal Finkel	7ae00a1282	[LoopVectorize] Use AA to partition potential dependency checks Prior to this change, the loop vectorizer did not make use of the alias analysis infrastructure. Instead, it performed memory dependence analysis using ScalarEvolution-based linear dependence checks within equivalence classes derived from the results of ValueTracking's GetUnderlyingObjects. Unfortunately, this meant that: 1. The loop vectorizer had logic that essentially duplicated that in BasicAA for aliasing based on identified objects. 2. The loop vectorizer could not partition the space of dependency checks based on information only easily available from within AA (TBAA metadata is currently the prime example). This means, for example, regardless of whether -fno-strict-aliasing was provided, the vectorizer would only vectorize this loop with a runtime memory-overlap check: void foo(int *a, float *b) { for (int i = 0; i < 1600; ++i) a[i] = b[i]; } This is suboptimal because the TBAA metadata already provides the information necessary to show that this check is unnecessary. Of course, the vectorizer has a limit on the number of such checks it will insert, so in practice, ignoring TBAA means not vectorizing more-complicated loops that we should. This change causes the vectorizer to use an AliasSetTracker to keep track of the pointers in the loop. The resulting alias sets are then used to partition the space of dependency checks, and potential runtime checks; this results in more-efficient vectorizations. When pointer locations are added to the AliasSetTracker, two things are done: 1. The location size is set to UnknownSize (otherwise you'd not catch inter-iteration dependencies) 2. For instructions in blocks that would need to be predicated, TBAA is removed (because the metadata might have a control dependency on the condition being speculated). For non-predicated blocks, you can leave the TBAA metadata. 
This is safe because you can't have an iteration dependency on the TBAA metadata (if you did, and you unrolled sufficiently, you'd end up with the same pointer value used by two accesses that TBAA says should not alias, and that would yield undefined behavior). llvm-svn: 213486	2014-07-20 23:07:52 +00:00
Ulrich Weigand	bb68610dc9	[PowerPC] ELFv2 MC support for .localentry directive A second binutils feature needed to support ELFv2 is the .localentry directive. In the ELFv2 ABI, functions may have two entry points: one for calling the routine locally via "bl", and one for calling the function via function pointer (either at the source level, or implicitly via a PLT stub for global calls). The two entry points share a single ELF symbol, where the ELF symbol address identifies the global entry point address, while the local entry point is found by adding a delta offset to the symbol address. That offset is encoded into three platform-specific bits of the ELF symbol st_other field. The .localentry directive instructs the assembler to set those fields to encode a particular offset. This is typically used by a function prologue sequence like this: func: addis r2, r12, (.TOC.-func)@ha addi r2, r2, (.TOC.-func)@l .localentry func, .-func Note that according to the ABI, when calling the global entry point, r12 must be set to point to the global entry point address itself; while when calling the local entry point, r2 must be set to point to the TOC base. The two instructions between the global and local entry point in the above example translate the first requirement into the second. This patch implements support in the PowerPC MC streamers to emit the .localentry directive (both into assembler and ELF object output), as well as support in the assembler parser to parse that directive. In addition, there is another change required in MC fixup/relocation handling to properly deal with relocations targeting function symbols with two entry points: When the target function is known local, the MC layer would immediately handle the fixup by inserting the target address -- this is wrong, since the call may need to go to the local entry point instead. 
The GNU assembler handles this case by not directly resolving fixups targeting functions with two entry points, but always emits the relocation and relies on the linker to handle this case correctly. This patch changes LLVM MC to do the same (this is done via the processFixupValue routine). Similarly, there are cases where the assembler would normally emit a relocation, but "simplify" it to a relocation targeting a section instead of the actual symbol. For the same reason as above, this may be wrong when the target symbol has two entry points. The GNU assembler again handles this case by not performing this simplification in that case, but leaving the relocation targeting the full symbol, which is then resolved by the linker. This patch changes LLVM MC to do the same (via the needsRelocateWithSymbol routine). NOTE: The method used in this patch is overly pessimistic, since the needsRelocateWithSymbol routine currently does not have access to the actual target symbol, and thus must always assume that it might have two entry points. This will be improved upon by a follow-on patch that modifies common code to pass the target symbol when calling needsRelocateWithSymbol. Reviewed by Hal Finkel. llvm-svn: 213485	2014-07-20 23:06:03 +00:00
Ulrich Weigand	0daa5164bf	[PowerPC] ELFv2 MC support for .abiversion directive ELFv2 binaries are marked by a bit in the ELF header e_flags field. A new assembler directive .abiversion can be used to set that flag. This patch implements support in the PowerPC MC streamers to emit the .abiversion directive (both into assembler and ELF binary output), as well as support in the assembler parser to parse the .abiversion directive. Reviewed by Hal Finkel. llvm-svn: 213484	2014-07-20 22:56:57 +00:00
Ulrich Weigand	241959722e	[PowerPC] Refactor byval handling in LowerFormalArguments_64SVR4 When handling an incoming byval argument, we need to possibly write incoming registers to the stack in order to create an on-stack image of the parameter, so we can return its address to common code. This currently uses CreateFixedObject to access the parts of the parameter save area where the argument is (or needs to be) stored. However, sometimes we need to access multiple parts of that area, e.g. to write multiple registers. The code currently uses a new CreateFixedObject call for each of these accesses, resulting in a patchwork of overlapping (fixed) stack objects. This doesn't really matter in the case of fixed objects, since any access to those turns into a fixed stackpointer + offset address anyway. However, with the upcoming ELFv2 patches, we may actually need to place an incoming argument into our own stack frame instead of the caller's. This means we need to use CreateStackObject instead, and we cannot have multiple overlapping instances of those. To make the rest of the argument handling code work equally in both situations, this patch refactors it to always use just a single call to CreateFixedObject, and access parts of that object as required using address arithmetic. This way, we can in a future patch substitute CreateStackObject without further changes. No change to generated code intended. llvm-svn: 213483	2014-07-20 22:36:52 +00:00
Ulrich Weigand	55a96650d9	[PowerPC] Fix FrameIndex handling in SelectAddressRegImm The PPCTargetLowering::SelectAddressRegImm routine needs to handle FrameIndex nodes in a special manner, by translating them into a TargetFrameIndex node. This was done in most cases, but seems to have been neglected in one path: when the input tree has an OR of the FrameIndex with an immediate. This can happen if the FrameIndex can be proven to be sufficiently aligned that an OR of that immediate is equivalent to an ADD. The missing handling of FrameIndex in that case caused the SelectionDAG instruction selection to miss opportunities to merge the OR back into the FrameIndex node, leading to superfluous addi/ori instructions in the final assembler output. llvm-svn: 213482	2014-07-20 22:26:40 +00:00
Joerg Sonnenberger	9720fcf4bf	Redo THUMB support. Discussed with and tested by: Saleem Abdulrasool llvm-svn: 213481	2014-07-20 20:53:37 +00:00
Simon Atanasyan	09f45ca39b	[Mips] Replace assembler code by YAML to make the 'dynlib-fileheader.test' test target independent. llvm-svn: 213480	2014-07-20 20:03:46 +00:00
Joerg Sonnenberger	8f6cf7085a	Revert r213467, it breaks non-thumb mode. llvm-svn: 213479	2014-07-20 20:00:26 +00:00
Artyom Skrobov	7d602f7a77	Namespace cleanup (no functional change) llvm-svn: 213478	2014-07-20 12:08:28 +00:00
NAKAMURA Takumi	45e0a83141	SIISelLowering.cpp: Define _USE_MATH_DEFINES to get M_PI provided by MS <cmath>. FIXME: Would it be better to move it into configure? llvm-svn: 213477	2014-07-20 11:15:07 +00:00
NAKAMURA Takumi	74a5332235	MachineRegionInfo.cpp: Another fix on MachineRegionInfo::MachineRegionInfo::recalculate() to appease msc17. llvm-svn: 213476	2014-07-20 11:14:55 +00:00
Manuel Jacob	6d5cc8d656	Remove braces around single-statement block and rangify outer loop. This is a follow-up to r213474. llvm-svn: 213475	2014-07-20 09:20:47 +00:00
Manuel Jacob	d11beffef4	[C++11] Add predecessors(BasicBlock *) / successors(BasicBlock *) iterator ranges. Summary: This patch introduces two new iterator ranges and updates existing code to use them. No functional change intended. Test Plan: All tests (make check-all) still pass. Reviewers: dblaikie Reviewed By: dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D4481 llvm-svn: 213474	2014-07-20 09:10:11 +00:00
Matt Arsenault	4100ebd67b	R600: Add missing test for concat_vectors llvm-svn: 213473	2014-07-20 07:13:17 +00:00
Matt Arsenault	0163e033e2	R600: Remove unused function llvm-svn: 213472	2014-07-20 06:31:06 +00:00
Matt Arsenault	e261b6e853	R600/SI: Remove dead code and add missing tests. This probably was killed by some generic DAGCombiner improvements in checking the TargetBooleanContents instead of just 1. llvm-svn: 213471	2014-07-20 06:11:02 +00:00
Saleem Abdulrasool	6747c7d01b	linux process: silence GCC switch coverage warning Add missing entry for eExecMessage message type to silence GCC switch coverage warning. llvm-svn: 213470	2014-07-20 05:28:57 +00:00
Saleem Abdulrasool	e1401eb747	build: fix cmake warning with newer CMake Hoist the compatibility macros out a level and re-use them when adding link dependencies. Silences a warning from CMake. llvm-svn: 213469	2014-07-20 05:28:55 +00:00
Bill Wendling	03074dd83e	Update formatting with clang-format. llvm-svn: 213468	2014-07-20 05:28:52 +00:00
Saleem Abdulrasool	8817bfe7e2	ARM: fix division in some cases For ARM cores that are ARMv6T2+ but not ARMv7ve or ARMv7-r and not an updated ARMv7-a that has the idiv extension (chips with clz but not idiv), an incorrect jump would be calculated due to the preference to thumb instructions over ARM. Rather than computing the target at runtime, use a jumptable instead. This trades a bit of storage for performance. The overhead is 32 bytes for each of the three routines, but avoids the calculation of the offset. Because clz was introduced in ARMv6T2 and idiv in certain versions of ARMv7, the non-clz, non-idiv case implies a target which does not support Thumb-2, and thus we cannot use Thumb on those targets (as it is unlikely that the assembly will assemble). Take the opportunity to refactor the IT block macros into assembly.h rather than redefining them in the TUs where they are used. Existing tests cover the full change already, so no new tests are added. This effectively reverts SVN r213309. llvm-svn: 213467	2014-07-20 04:44:21 +00:00
NAKAMURA Takumi	8eb82fc453	Fix msc17 build. RegionInfo::RegionInfo::recalculate() doesn't make sense. llvm-svn: 213466	2014-07-20 03:57:51 +00:00
NAKAMURA Takumi	118b0c789d	Fix -Asserts build introduced since r213456. llvm-svn: 213465	2014-07-20 00:00:42 +00:00
David Blaikie	ba80ee392a	Shore up ownership passing of the PBQPBuilder by passing unique_ptrs by value rather than lvalue reference. Also removes an unnecessary '.release()' that should've been a std::move anyway. (I'm on a hunt for '.release()' calls) llvm-svn: 213464	2014-07-19 21:19:45 +00:00
Saleem Abdulrasool	00426d9c19	MC: permit emitting a symbol value as section relative This adds an optional parameter to the EmitSymbolValue method in MCStreamer to permit emitting a symbol value as a section relative value. This is to cover the use in MCDwarf which should not really know about how to emit a section relative value for a given target. This addresses post-review comments from Eric Christopher in SVN r213275. llvm-svn: 213463	2014-07-19 21:01:58 +00:00
Simon Atanasyan	5ecadda642	[Mips] Replace assembler code by YAML to make the test 'dynlib-dynamic.test' target independent. llvm-svn: 213462	2014-07-19 20:18:46 +00:00
Matt Arsenault	1c407fb5a3	Revert accidentally committed r213459 llvm-svn: 213461	2014-07-19 19:17:33 +00:00
Matt Arsenault	1b54c238b6	Fix build with GCC. Seems like a bug in either GCC or clang, but I'm not sure which is right. llvm-svn: 213460	2014-07-19 19:16:36 +00:00
Matt Arsenault	b38677ee2f	XXX - Increase unroll threshold llvm-svn: 213459	2014-07-19 19:16:34 +00:00
Matt Arsenault	ad14ce84b7	R600/SI: implement range reduction for sin/cos These instructions can only take a limited input range, and return the constant value 1 out of range. We should do range reduction to be able to process arbitrary values. Use a FRACT instruction after normalization to achieve this. Also add a test for constant folding with the lowered code with unsafe-fp-math enabled. v2: use DAG lowering instead of intrinsic, adapt test v3: calculate constant, fold pattern into instruction definition v4: misc style fixes, add sin-fold testcase, cosmetics Patch by Grigori Goronzy llvm-svn: 213458	2014-07-19 18:44:39 +00:00
Matt Arsenault	8ca36815ee	Update for RegionInfo changes. Mostly related to missing includes and renaming of the pass to RegionInfoPass. llvm-svn: 213457	2014-07-19 18:40:17 +00:00
Matt Arsenault	1b8d83796d	Templatify RegionInfo so it works on MachineBasicBlocks llvm-svn: 213456	2014-07-19 18:29:29 +00:00
Matt Arsenault	a93441fe9c	R600: Implement a few simple TTI queries. I'm not sure if these have any effect right now. llvm-svn: 213455	2014-07-19 18:15:16 +00:00
Ben Langmuir	b797d59f03	If a module build reports errors, don't try to load it ... just to find out that it didn't build. llvm-svn: 213454	2014-07-19 16:29:28 +00:00
Hal Finkel	aac5fc9cf8	[LoopVectorize] Use CreateAligned(Load|Store) IRBuilder has CreateAligned(Load|Store) functions; use them and we don't need to make a second call to setAlignment. No functionality change intended. llvm-svn: 213453	2014-07-19 13:39:45 +00:00
Hal Finkel	4f7d55aac8	[LoopVectorize] Propagate known metadata to vectorized instructions There are some kinds of metadata that are safe to propagate from the scalar instructions to the vector instructions (fpmath and tbaa currently). Regarding TBAA, one might worry about propagating it on if-converted loads and stores, because the metadata might have had a control dependency on the condition, and thus actually aliased with some other non-speculated memory access when the condition was false. However, this would be caught by the runtime overlap checks. llvm-svn: 213452	2014-07-19 13:33:16 +00:00
Andrea Di Biagio	2aacd94d40	[x86] Fix wrong shuffle mask in test 'combine-vec-shuffle-3.ll'. No functional change. Function @test3c should check that the DAGCombiner is able to fold a pair of shuffles into a new shuffle with a permute mask of <6,7,2,3>. However, one of the shuffles in @test3c had a wrong permute mask; this prevented the DAGCombiner from folding the shuffles into the expected result. Now that the shuffle mask is fixed, the backend correctly folds the two shuffles in function @test3c into a single movhlps instruction. llvm-svn: 213451	2014-07-19 07:52:58 +00:00
Viktor Kutuzov	99400a5a34	Revert D3908 due to issues on Mac platforms llvm-svn: 213450	2014-07-19 05:58:38 +00:00
Hal Finkel	b8e7c736fb	Handle AddrSpaceCast in stripAndAccumulateInBoundsConstantOffsets All of the other similar functions in that part of the file look through addrspacecast in addition to bitcast, and I see no reason why stripAndAccumulateInBoundsConstantOffsets shouldn't do so also. llvm-svn: 213449	2014-07-19 03:32:02 +00:00
NAKAMURA Takumi	ab184fb88d	MergedLoadStoreMotion.cpp: Fix msc17 build. Member initializer is unavailable. llvm-svn: 213448	2014-07-19 03:29:25 +00:00
Hal Finkel	9e440c08a9	Make Value::isDereferenceablePointer handle offsets to pointer types with dereferenceable attributes When we have a parameter (or call site return) with a dereferenceable attribute, it can specify the size of an array pointed to by that parameter. If we have a value for which we can accumulate a constant offset to such a parameter, then we can use that offset in a direct comparison with the size specified by the dereferenceable attribute. This enables us to handle cases like this: int foo(int a[static 3]) { return a[2]; /* this is always dereferenceable */ } llvm-svn: 213447	2014-07-19 03:25:16 +00:00
Hal Finkel	16e394a36c	Cleanup comparisons to VariableArrayType::Static for non-VLAs The enum is part of ArrayType, so there is no functional change, but comparing to ArrayType::Static for non-VLAs makes more sense. llvm-svn: 213446	2014-07-19 02:13:40 +00:00
Hal Finkel	bfe2d3c0f9	TypePrinter should not ignore IndexTypeCVRQualifiers on constant-sized arrays C99 array parameters can have index-type CVR qualifiers, and the TypePrinter should print them when present (and we were not for constant-sized arrays). Otherwise, we'd drop the restrict in: int foo(int a[restrict static 3]) { ... } llvm-svn: 213445	2014-07-19 02:01:03 +00:00
Hal Finkel	48d53e2c4c	Use the dereferenceable attribute on C99 array parameters with static In C99, an array parameter declarator might have the form: direct-declarator '[' 'static' type-qual-list[opt] assign-expr ']' where the static keyword indicates that the caller will always provide a pointer to the beginning of an array with at least the number of elements specified by the assignment expression. For constant sizes, we can use the new dereferenceable attribute to pass this information to the optimizer. For VLAs, we don't know the size, but (for addrspace(0)) do know that the pointer must be nonnull (and so we can use the nonnull attribute). llvm-svn: 213444	2014-07-19 01:41:07 +00:00
Richard Smith	1b98ccc4e9	PR20356: Fix all Sema warnings with mismatched ext_/warn_ versus ExtWarn/Warnings. Mostly the name of the warning was changed to match the semantics, but in the PR20356 cases, the warning was about valid code, so the diagnostic was changed from ExtWarn to Warning instead. llvm-svn: 213443	2014-07-19 01:39:17 +00:00
Saleem Abdulrasool	c4e00289a7	ARM: correct WoA __builtin_alloca handling on O0 When performing a dynamic stack adjustment without optimisations, we would mark SP as def and R4 as kill. This occurred as part of the expansion of a WIN__CHKSTK SDNode which indicated the proper handling of SP and R4. The result would be that we would double define SP as part of an operation, which is obviously incorrect. Furthermore, the VTList for the chain had an incorrect parameter type of i32 instead of Other. Correct these to permit proper lowering of __builtin_alloca at -O0. llvm-svn: 213442	2014-07-19 01:29:51 +00:00
NAKAMURA Takumi	6096d44f76	clang/test/Misc/backend-optimization-failure.cpp: Appease to add -triple=x86_64. FIXME: Could this be made generic? llvm-svn: 213441	2014-07-19 01:17:32 +00:00
Jim Ingham	4ac0443fd9	Add the ability to suppress the creation of a persistent result variable and use it in "Process::LoadImage" so that, for instance, "process load" doesn't increment the return variable number. llvm-svn: 213440	2014-07-19 01:09:16 +00:00
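
The .localentry commit (r213485) says the local entry offset is encoded into three platform-specific bits of the ELF symbol's st_other field. The following is a minimal sketch of that encoding in C, assuming (from my reading of the ELFv2 ABI, not from the commit text) that the field occupies the top three bits of st_other and that a field value n of 2 or more stands for an offset of 2^n bytes. The macro and function names are hypothetical and are not LLVM or binutils identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names. Assumption: the local-entry field lives in
   bits 7..5 of st_other, as I understand the ELFv2 ABI. */
#define LOCALENTRY_SHIFT 5u
#define LOCALENTRY_MASK  (7u << LOCALENTRY_SHIFT)

/* Decode the byte offset from the global to the local entry point.
   Field values 0 and 1 both decode to an offset of 0. */
unsigned decode_local_entry_offset(uint8_t st_other) {
    unsigned field = (st_other & LOCALENTRY_MASK) >> LOCALENTRY_SHIFT;
    return ((1u << field) >> 2) << 2;
}

/* Encode an offset of 0, 4, 8, 16, 32 or 64 bytes into st_other bits. */
uint8_t encode_local_entry_offset(unsigned offset) {
    if (offset == 0)
        return 0;
    unsigned field = 2;
    while (field < 7 && (1u << field) != offset)
        field++;
    assert(field < 7 && "offset must be 0, 4, 8, 16, 32 or 64");
    return (uint8_t)(field << LOCALENTRY_SHIFT);
}
```

Under these assumptions, the two-instruction prologue quoted in the commit message (addis plus addi, 8 bytes) would be recorded as `encode_local_entry_offset(8)`, and a consumer would recover the local entry point as the symbol address plus `decode_local_entry_offset(st_other)`.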

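
Several of the commits above (r213444, r213445, r213447) deal with C99 array parameters declared with `static`. A small sketch of the source-level feature, in C99, may help when reading them. The function names are illustrative only, and the comments restate what the commit messages claim about the attributes clang emits rather than anything checked here.

```c
#include <assert.h>

/* The caller promises at least 3 elements. Per r213444, clang can
   mark such a parameter dereferenceable for constant sizes, so a[2]
   is known to be a valid access. */
int third_element(int a[static 3]) {
    return a[2];
}

/* Index-type qualifiers such as restrict may appear alongside static;
   r213445 is about printing them for constant-sized arrays. */
int third_element_restrict(int a[restrict static 3]) {
    return a[2];
}

/* Small self-check that exercises both declarations. */
void check_static_array_params(void) {
    int v[3] = {10, 20, 30};
    assert(third_element(v) == 30);
    assert(third_element_restrict(v) == 30);
}
```

Passing a pointer to fewer than three elements to either function would break the promise made by `static 3` and is undefined behavior in C99, which is what lets the optimizer rely on the size.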
