llvm-project

Commit Graph

Author	SHA1	Message	Date
Eric Christopher	457864178f	Remove a call to TargetMachine::getSubtarget from the inline asm support in the asm printer. If we can get a subtarget from the machine function then we should do so, otherwise we can go ahead and create a default one since we're at the module level. llvm-svn: 229916	2015-02-19 21:24:23 +00:00
Colin LeMahieu	1174fea31c	[Hexagon] Moving remaining methods off of HexagonMCInst in to HexagonMCInstrInfo and eliminating HexagonMCInst class. llvm-svn: 229914	2015-02-19 21:10:50 +00:00
Benjamin Kramer	68ca67b212	MC: Allow multiple comma-separated expressions on the .uleb128 directive. For compatibility with GNU as. Binutils documents this as '.uleb128 expressions'. Subtle, isn't it? llvm-svn: 229911	2015-02-19 20:24:04 +00:00
Benjamin Kramer	dfedfeb298	SSAUpdater: Use range-based for. NFC. llvm-svn: 229908	2015-02-19 20:04:02 +00:00
Eric Christopher	64d35be6d6	Remove unused argument from emitInlineAsmStart. llvm-svn: 229907	2015-02-19 19:52:25 +00:00
Michael Gottesman	2e0e4e07b4	[objc-arc] Convert the bodies of ARCInstKind predicates into covered switches. This is much better than the previous manner of just using short-circuiting booleans from: 1. A "naive" efficiency perspective: we do not have to rely on the compiler to change the short circuiting boolean operations into a switch. 2. An understanding perspective by making the implicit behavior of negative predicates explicit. 3. A maintainability perspective through the covered switch flag making it easy to know where to update code when adding new ARCInstKinds. llvm-svn: 229906	2015-02-19 19:51:36 +00:00
Michael Gottesman	6f729fa675	[objc-arc] Change the InstructionClass to be an enum class called ARCInstKind. I also renamed ObjCARCUtil.cpp -> ARCInstKind.cpp. That file only contained items related to ARCInstKind anyways. llvm-svn: 229905	2015-02-19 19:51:32 +00:00
Chris Bieneman	a747e5935d	Checking if TARGET_OS_IPHONE is defined isn't good enough for 10.7 and earlier. Older versions of the TargetConditionals header always defined TARGET_OS_IPHONE to something (0 or 1), so we need to test not only for the existence but also if it is 1. This resolves PR22631. llvm-svn: 229904	2015-02-19 19:50:52 +00:00
Colin LeMahieu	745c4710db	[Hexagon] Moving more functions off of HexagonMCInst and in to HexagonMCInstrInfo. llvm-svn: 229903	2015-02-19 19:49:27 +00:00
Adam Nemet	57ac766ee9	[LoopAccesses] Change LAA:getInfo to return a constant reference As expected, this required a few more const-correctness fixes. Based on Hal's feedback on D7684. llvm-svn: 229899	2015-02-19 19:15:21 +00:00
Adam Nemet	e91cc6ef93	[LoopAccesses] Add -analyze support The LoopInfo in combination with depth_first is used to enumerate the loops. Right now -analyze is not yet complete. It only prints the result of the analysis, the report and the run-time checks. Printing the unsafe dependences will require a bit more reshuffling which I'd like to do in a follow-on to this patchset. Unsafe dependences are currently checked via -debug-only=loop-accesses in the new test. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229898	2015-02-19 19:15:19 +00:00
Adam Nemet	2bd6e984ef	[LoopAccesses] Split out LoopAccessReport from VectorizerReport The only difference between these two is that VectorizerReport adds a vectorizer-specific prefix to its messages. When LAA is used in the vectorizer context the prefix is added when we promote the LoopAccessReport into a VectorizerReport via one of the constructors. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229897	2015-02-19 19:15:15 +00:00
Adam Nemet	3e87634fd8	[LoopAccesses] Add missing const to APIs in VectorizationReport When I split out LoopAccessReport from this, I need to create some temps so constness becomes necessary. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229896	2015-02-19 19:15:13 +00:00
Adam Nemet	929c38e8ff	[LoopAccesses] Add canAnalyzeLoop This allows the analysis to be attempted with any loop. This feature will be used with -analysis. (LV only requests the analysis on loops that have already satisfied these tests.) This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229895	2015-02-19 19:15:10 +00:00
Adam Nemet	339f42b396	[LoopAccesses] Change debug messages from LV to LAA Also add pass name as an argument to VectorizationReport::emitAnalysis. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229894	2015-02-19 19:15:07 +00:00
Adam Nemet	3bfd93d789	[LoopAccesses] Create the analysis pass This is a function pass that runs the analysis on demand. The analysis can be initiated by querying the loop access info via LAA::getInfo. It either returns the cached info or runs the analysis. Symbolic stride information continues to reside outside of this analysis pass. We may move it inside later but it's not a priority for me right now. The idea is that Loop Distribution won't support run-time stride checking at least initially. This means that when querying the analysis, symbolic stride information can be provided optionally. Whether stride information is used can invalidate the cache entry and rerun the analysis. Note that if the loop does not have any symbolic stride, the entry should be preserved across Loop Distribution and LV. Since currently the only user of the pass is LV, I just check that the symbolic stride information didn't change when using a cached result. On the LV side, LoopVectorizationLegality requests the info object corresponding to the loop from the analysis pass. A large chunk of the diff is due to LAI becoming a pointer from a reference. A test will be added as part of the -analyze patch. Also tested that with AVX, we generate identical assembly output for the testsuite (including the external testsuite) before and after. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229893	2015-02-19 19:15:04 +00:00
Adam Nemet	436018c3ff	[LoopAccesses] Cache the result of canVectorizeMemory LAA will be an on-demand analysis pass, so we need to cache the result of the analysis. canVectorizeMemory is renamed to analyzeLoop which computes the result. canVectorizeMemory becomes the query function for the cached result. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229892	2015-02-19 19:15:00 +00:00
Adam Nemet	c922853b93	[LoopAccesses] Stash the report from the analysis rather than emitting it The transformation passes will query this and then emit them as part of their own report. The only current user, LV, is modified to do just that. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229891	2015-02-19 19:14:56 +00:00
Adam Nemet	f219c64723	[LoopAccesses] Make VectorizerParams global + fix for cyclic dep As LAA is becoming a pass, we can no longer pass the params to its constructor. This changes the command line flags to have external storage. These can now be accessed both from LV and LAA. VectorizerParams is moved out of LoopAccessInfo in order to shorten the code to access it. This commit also has the fix (D7731) to break the dependence cycle between the analysis and vector libraries. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229890	2015-02-19 19:14:52 +00:00
Adam Nemet	04d4163e95	Revert "Reformat." This reverts commit r229651. I'd like to ultimately revert r229650 but this reformat stands in the way. I'll reformat the affected files once the loop-access pass is fully committed. llvm-svn: 229889	2015-02-19 19:14:34 +00:00
Colin LeMahieu	af304e5192	[Hexagon] Creating HexagonMCInstrInfo namespace as landing zone for static functions detached from HexagonMCInst. llvm-svn: 229885	2015-02-19 19:00:00 +00:00
Eric Christopher	504f388a84	Update and remove a few calls to TargetMachine::getSubtargetImpl out of the asm printer. llvm-svn: 229883	2015-02-19 18:46:23 +00:00
Kostya Serebryany	016852c396	[fuzzer] split main() into FuzzerDriver() that takes a callback as a parameter and a tiny main() in a separate file llvm-svn: 229882	2015-02-19 18:45:37 +00:00
Ben Langmuir	0897091730	Assume the original file is created before release in LockFileManager This is true in clang, and lets us remove the problematic code that waits around for the original file and then times out if it doesn't get created in short order. This caused any 'dead' lock file or legitimate time out to cause a cascade of timeouts in any processes waiting on the same lock (even if they only just showed up). llvm-svn: 229881	2015-02-19 18:22:35 +00:00
Kostya Serebryany	2117269dd1	[fuzzer] properly annotate fallthrough, add one more entry to FAQ llvm-svn: 229880	2015-02-19 18:21:12 +00:00
Colin LeMahieu	f08a3ccf50	[Hexagon] Removing static variable holding MCInstrInfo. llvm-svn: 229872	2015-02-19 17:38:39 +00:00
Benjamin Kramer	1c2beed7fd	LSR: Move set instead of copying. NFC. llvm-svn: 229871	2015-02-19 17:19:43 +00:00
Rafael Espindola	8c97e19124	Avoid conversion to float when creating ConstantDataArray/ConstantDataVector. Patch by Raoux, Thomas F! llvm-svn: 229864	2015-02-19 16:08:20 +00:00
Benjamin Kramer	ea68a944a1	Demote vectors to arrays. No functionality change. llvm-svn: 229861	2015-02-19 15:26:17 +00:00
Chandler Carruth	5d1a84b7b8	[x86] Delete still more piles of complex code now that we have a good systematic lowering of v8i16. This required a slight strategy shift to prefer unpack lowerings in more places. While this isn't a cut-and-dry win in every case, it is in the overwhelming majority. There are only a few places where the old lowering would probably be a touch faster, and then only by a small margin. In some cases, this is yet another significant improvement. llvm-svn: 229859	2015-02-19 15:21:57 +00:00
Chandler Carruth	0b39536390	[x86] Teach the unpack lowering how to lower with an initial unpack in addition to lowering to trees rooted in an unpack. This saves shuffles and or registers in many various ways, lets us handle another class of v4i32 shuffles pre SSE4.1 without domain crosses, etc. llvm-svn: 229856	2015-02-19 15:06:13 +00:00
Chandler Carruth	352eba1c29	[x86] Dramatically improve v8i16 shuffle lowering by not using its terribly complex partial blend logic. This code path was one of the more complex and bug prone when it first went in and it hasn't fared much better. Ultimately, with the simpler basis for unpack lowering and support bit-math blending, this is completely obsolete. In the worst case without this we generate different but equivalent instructions. However, in many cases we generate much better code. This is especially true when blends or pshufb is available. This does expose one (minor) weakness of the unpack lowering that I'll try to address. In case you were wondering, this is actually a big part of what I've been trying to pull off in the recent string of commits. llvm-svn: 229853	2015-02-19 14:08:24 +00:00
Chandler Carruth	2c0390ca4b	[x86] Remove the final fallback in the v8i16 lowering that isn't really needed, and significantly improve the SSSE3 path. This makes the new strategy much more clear. If we can blend, we just go with that. If we can't blend, we try to permute into an unpack so that we handle cases where the unpack doing the blend also simplifies the shuffle. If that fails and we've got SSSE3, we now call into factored-out pshufb lowering code so that we leverage the fact that pshufb can set up a blend for us while shuffling. This generates great code, especially because we know we don't have a fast blend at this point. Finally, we fall back on decomposing into permutes and blends because we do at least have a bit-math-based blend if we need to use that. This pretty significantly improves some of the v8i16 code paths. We never need to form pshufb for the single-input shuffles because we have effective target-specific combines to form it there, but we were missing its effectiveness in the blends. llvm-svn: 229851	2015-02-19 13:56:49 +00:00
Chandler Carruth	f0f0d27391	[x86] Simplify the pre-SSSE3 v16i8 lowering significantly by decomposing them into permutes and a blend with the generic decomposition logic. This works really well in almost every case and lets the code only manage the expansion of a single input into two v8i16 vectors to perform the actual shuffle. The blend-based merging is often much nicer than the pack based merging that this replaces. The only place where it isn't we end up blending between two packs when we could do a single pack. To handle that case, just teach the v2i64 lowering to handle these blends by digging out the operands. With this we're down to only really random permutations that cause an explosion of instructions. llvm-svn: 229849	2015-02-19 13:15:12 +00:00
Chandler Carruth	8817e5e01b	[x86] Remove the insanely over-aggressive unpack lowering strategy for v16i8 shuffles, and replace it with new facilities. This uses precise patterns to match exact unpacks, and the new generalized unpack lowering only when we detect a case where we will have to shuffle both inputs anyways and they terminate in exactly a blend. This fixes all of the blend horrors that I uncovered by always lowering blends through the vector shuffle lowering. It also removes sooooo much of the crazy instruction sequences required for v16i8 lowering previously. Much cleaner now. The only "meh" aspect is that we sometimes use pshufb+pshufb+unpck when it would be marginally nicer to use pshufb+pshufb+por. However, the difference there is tiny. In many cases it's a win because we re-use the pshufb mask. In others, we get to avoid the pshufb entirely. I've left a FIXME, but I'm dubious we can really do better than this. I'm actually pretty happy with this lowering now. For SSE2 this exposes some horrors that were really already there. Those will have to be fixed by changing a different path through the v16i8 lowering. llvm-svn: 229846	2015-02-19 12:10:37 +00:00
Jozef Kolek	5d171fc291	[mips][microMIPS] Make usage of AND16, OR16 and XOR16 by code generator Differential Revision: http://reviews.llvm.org/D7611 llvm-svn: 229845	2015-02-19 11:51:32 +00:00
Chandler Carruth	38dea42ddf	[x86] The SELECT x86 DAG combine also does legalization. It used to rely on things not being marked as either custom or legal, but we now do custom lowering of more VSELECT nodes. To cope with this, manually replicate the legality tests here. These have to stay in sync with the set of tests used in the custom lowering of VSELECT. Ideally, we wouldn't do any of this combine-based-legalization when we have an actual custom legalization step for VSELECT, but I'm not going to be able to rewrite all of that today. I don't have a test case for this currently, but it was found when compiling a number of the test-suite benchmarks. I'll try to reduce a test case and add it. This should at least fix the test-suite fallout on build bots. llvm-svn: 229844	2015-02-19 11:43:37 +00:00
Michael Kuperstein	efd7a96d2e	Reverting r229831 due to multiple ARM/PPC/MIPS build-bot failures. llvm-svn: 229841	2015-02-19 11:38:11 +00:00
Igor Laevsky	9570ff94f7	Implement invoke statepoint verification. Differential Revision: http://reviews.llvm.org/D7366 llvm-svn: 229840	2015-02-19 11:28:47 +00:00
Igor Laevsky	77f118f878	Add invoke related functionality into StatepointSite classes. Differential Revision: http://reviews.llvm.org/D7364 llvm-svn: 229838	2015-02-19 11:02:11 +00:00
Elena Demikhovsky	69e8b45b13	AVX-512: Full implementation for VRNDSCALESS/SD instructions and intrinsics. llvm-svn: 229837	2015-02-19 10:48:04 +00:00
Chandler Carruth	bcb6c5f62d	[x86] Add support for bit-wise blending and use it in the v8 and v16 lowering paths. I'm going to be leveraging this to simplify a lot of the overly complex lowering of v8 and v16 shuffles in pre-SSSE3 modes. Sadly, this isn't profitable on v4i32 and v2i64. There, the float and double blending instructions for pre-SSE4.1 are actually pretty good, and we can't beat them with bit math. And once SSE4.1 comes around we have direct blending support and this ceases to be relevant. Also, some of the test cases look odd because the domain fixer canonicalizes these to floating point domain. That's OK, it'll use the integer domain when it matters and some day I may be able to update enough of LLVM to canonicalize the other way. This restores almost all of the regressions from teaching x86's vselect lowering to always use vector shuffle lowering for blends. The remaining problems are because the v16 lowering path is still doing crazy things. I'll be re-arranging that strategy in more detail in subsequent commits to finish recovering the performance here. llvm-svn: 229836	2015-02-19 10:46:52 +00:00
Chandler Carruth	b89464a9b6	[x86,sdag] Two interrelated changes to the x86 and sdag code. First, don't combine bit masking into vector shuffles (even ones the target can handle) once operation legalization has taken place. Custom legalization of vector shuffles may exist for these patterns (making the predicate return true) but that custom legalization may in some cases produce the exact bit math this matches. We only really want to handle this prior to operation legalization. However, the x86 backend, in a fit of awesome, relied on this. What it would do is mark VSELECTs as expand, which would turn them into arithmetic, which this would then match back into vector shuffles, which we would then lower properly. Amazing. Instead, the second change is to teach the x86 backend to directly form vector shuffles from VSELECT nodes with constant conditions, and to mark all of the vector types we support lowering blends as shuffles as custom VSELECT lowering. We still mark the forms which actually support variable blends as legal so that the custom lowering is bypassed, and the legal lowering can even be used by the vector shuffle legalization (yes, i know, this is confusing. but that's how the patterns are written). This makes the VSELECT lowering much more sensible, and in fact should fix a bunch of bugs with it. However, as you'll see in the test cases, right now what it does is point out the hilarious deficiency of the new vector shuffle lowering when it comes to blends. Fortunately, my very next patch fixes that. I can't submit it yet, because that patch, somewhat obviously, forms the exact and/or pattern that the DAG combine is matching here! Without this patch, teaching the vector shuffle lowering to produce the right code infloops in the DAG combiner. With this patch alone, we produce terrible code but at least lower through the right paths. With both patches, all the regressions here should be fixed, and a bunch of the improvements (like using 2 shufps with no memory loads instead of 2 andps with memory loads and an orps) will stay. Win! There is one other change worth noting here. We had hilariously wrong vectorization cost estimates for vselect because we fell through to the code path that assumed all "expand" vector operations are scalarized. However, the "expand" lowering of VSELECT is vector bit math, most definitely not scalarized. So now we go back to the correct if horribly naive cost of "1" for "not scalarized". If anyone wants to add actual modeling of shuffle costs, that would be cool, but this seems an improvement on its own. Note the removal of 16 and 32 "costs" for doing a blend. Even in SSE2 we can blend in fewer than 16 instructions. ;] Of course, we don't right now because of OMG bad code, but I'm going to fix that. Next patch. I promise. llvm-svn: 229835	2015-02-19 10:36:19 +00:00
Michael Kuperstein	ba5b04c798	Use std::bitset for SubtargetFeatures Previously, subtarget features were a bitfield with the underlying type being uint64_t. Since several targets (X86 and ARM, in particular) have hit or were very close to hitting this bound, switching the features to use a bitset. No functional change. Differential Revision: http://reviews.llvm.org/D7065 llvm-svn: 229831	2015-02-19 09:01:04 +00:00
Davide Italiano	faafae33fa	[Support/Timer] Make GetMallocUsage() aware of jemalloc. Differential Revision: D7657 Reviewed by: shankarke, majnemer llvm-svn: 229824	2015-02-19 07:27:14 +00:00
Dmitri Gribenko	3e1551c96f	Provide the same ABI regardless of NDEBUG For projects depending on LLVM, I find it very useful to combine a release-no-asserts build of LLVM with a debug+asserts build of the dependent project. The motivation is that when developing a dependent project, you are debugging that project itself, not LLVM. In my usecase, a significant part of the runtime is spent in LLVM optimization passes, so I would like to build LLVM without assertions to get the best performance from this combination. Currently, `lib/Support/Debug.cpp` changes the set of symbols it provides depending on NDEBUG, while `include/llvm/Support/Debug.h` requires extra symbols when NDEBUG is not defined. Thus, it is not possible to enable assertions in an external project that uses facilities of `Debug.h`. This patch changes `Debug.cpp` and `Valgrind.cpp` to always define the symbols that other code may depend on when #including LLVM headers without NDEBUG. http://reviews.llvm.org/D7662 llvm-svn: 229819	2015-02-19 05:30:16 +00:00
Eric Christopher	d84f5d30e2	Remove the local subtarget variable from the SystemZ asm printer and update the two calls accordingly. llvm-svn: 229805	2015-02-19 01:26:28 +00:00
Eric Christopher	0795a2ef0c	Remove a few more calls to TargetMachine::getSubtarget from the R600 port. llvm-svn: 229804	2015-02-19 01:10:55 +00:00
Eric Christopher	7edca437f5	Grab the subtarget off of the machine function for the R600 asm printer and clean up a bunch of uses. llvm-svn: 229803	2015-02-19 01:10:53 +00:00
Eric Christopher	96caeda730	Remove the DisasmEnabled AsmPrinter variable and just look it up on the subtarget where it's set anyhow rather than looking it up 2-3 times in the same place. llvm-svn: 229802	2015-02-19 01:10:49 +00:00
Peter Collingbourne	fb8002cbe0	MC: Remove NullStreamer hook, as it is redundant with NullTargetStreamer. llvm-svn: 229799	2015-02-19 00:45:07 +00:00
Peter Collingbourne	20c7259ce9	Introduce Target::createNullTargetStreamer and use it from IRObjectFile. A null MCTargetStreamer allows IRObjectFile to ignore target-specific directives. Previously we were crashing. Differential Revision: http://reviews.llvm.org/D7711 llvm-svn: 229797	2015-02-19 00:45:02 +00:00
Michael Gottesman	e5ad66f8a9	[objc-arc] Introduce the concept of RCIdentity and rename all relevant functions to use that name. NFC. The RCIdentity root ("Reference Count Identity Root") of a value V is a dominating value U for which retaining or releasing U is equivalent to retaining or releasing V. In other words, ARC operations on V are equivalent to ARC operations on U. This is a useful property to ascertain since we can use this in the ARC optimizer to make it easier to match up ARC operations by always mapping ARC operations to RCIdentityRoots instead of pointers themselves. Then we perform pairing of retains, releases which are applied to the same RCIdentityRoot. In general, the two ways that we see RCIdentical values in ObjC are via: 1. PointerCasts 2. Forwarding Calls that return their argument verbatim. As such in ObjC, two RCIdentical pointers must always point to the same memory location. Previously this concept was implicit in the code and various methods that dealt with this concept were given functional names that did not conform to any name in the "ARC" model. This often times resulted in code that was hard for the non-ARC acquainted to understand resulting in unhappiness and confusion. llvm-svn: 229796	2015-02-19 00:42:38 +00:00
Michael Gottesman	dfa3e4b08a	[objc-arc-contract] Rename contractRelease => tryToContractReleaseIntoStoreStrong. NFC. Makes it clearer what this method is actually supposed to do. llvm-svn: 229795	2015-02-19 00:42:34 +00:00
Michael Gottesman	1827973f80	[objc-arc-contract] Refactor out tryToPeepholeInstruction into its own method. NFC. The main method of ObjCARCContract is really large and busy. By refactoring this out, it becomes easier to reason about. llvm-svn: 229794	2015-02-19 00:42:30 +00:00
Michael Gottesman	56bd6a077a	[objc-arc-contract] Reorganize the code a bit and make the debug output easier to read. llvm-svn: 229793	2015-02-19 00:42:27 +00:00
Duncan P. N. Exon Smith	3d62bbacb1	IR: Drop scope from MDTemplateParameter Follow-up to r229740, which removed `DITemplate*::getContext()` after my upgrade script revealed that scopes are always `nullptr` for template parameters. This is the other shoe: drop `scope:` from `MDTemplateParameter` and its two subclasses. (Note: a bitcode upgrade would be pointless, since the hierarchy hasn't been moved into place.) llvm-svn: 229791	2015-02-19 00:37:21 +00:00
Eric Christopher	ca929f2469	Avoid using a self-referential initializer and fix up uses. llvm-svn: 229790	2015-02-19 00:22:47 +00:00
Eric Christopher	111de895a0	80-column fixups. llvm-svn: 229789	2015-02-19 00:15:33 +00:00
Eric Christopher	02389e3886	Remove all use of is64bit off of NVPTXSubtarget and clean up code accordingly. This changes the constructors of a number of classes that don't need to know the subtarget's 64-bitness. llvm-svn: 229787	2015-02-19 00:08:27 +00:00
Eric Christopher	beffc4e84f	Remove all use of getDrvInterface off of NVPTXSubtarget and clean up code accordingly. Delete code that was checking for all cases of an enum. llvm-svn: 229786	2015-02-19 00:08:23 +00:00
Eric Christopher	6aad8b1801	Migrate the NVPTX backend asm printer to a per function subtarget. This involved moving two non-subtarget dependent features (64-bitness and the driver interface) to the NVPTX target machine and updating the uses (or migrating around the subtarget use for ease of review). Otherwise use the cached subtarget or create a default subtarget based on the TargetMachine cpu and feature string for the module level assembler emission. llvm-svn: 229785	2015-02-19 00:08:14 +00:00
Duncan P. N. Exon Smith	5c9a17732b	IR: Allow MDSubrange to have 'count: -1' It turns out that `count: -1` is a special value indicating an empty array, such as `Values` in: struct T { unsigned Count; int Values[]; }; Handle it. llvm-svn: 229769	2015-02-18 23:17:51 +00:00
Reid Kleckner	7bb0738d82	Add an IR-to-IR test for dwarf EH preparation using opt This tests the simple resume instruction elimination logic that we have before making some changes to it. llvm-svn: 229768	2015-02-18 23:17:41 +00:00
Andrew Kaylor	179543bb9b	Style and formatting fixes for r229715 llvm-svn: 229758	2015-02-18 22:52:18 +00:00
Marek Olsak	9b8f32eed1	R600/SI: Fix READLANE and WRITELANE lane select for VI VOP2 declares vsrc1, but VOP3 declares src1. We can't use the same "ins" if the operands have different names in VOP2 and VOP3 encodings. This fixes a hang in geometry shaders which spill M0 on VI. (BTW it doesn't look like M0 needs spilling and the spilling seems duplicated 3 times) llvm-svn: 229752	2015-02-18 22:12:45 +00:00
Marek Olsak	8eeebcccb5	R600/SI: Simplify verification of AMDGPU::OPERAND_REG_INLINE_C llvm-svn: 229751	2015-02-18 22:12:41 +00:00
Marek Olsak	b8c818337d	R600/SI: Remove explicit VOP operand checking This should be handled by the OperandType checking. llvm-svn: 229750	2015-02-18 22:12:37 +00:00
Duncan P. N. Exon Smith	cd8fb60fce	IR: Swap order of name and value in MDEnum Put the name before the value in assembly for `MDEnum`. While working on the testcase upgrade script for the new hierarchy, I noticed that it "looks nicer" to have the name first, since it lines the names up in the (somewhat typical) case that they have a common prefix. llvm-svn: 229747	2015-02-18 21:16:33 +00:00
Duncan P. N. Exon Smith	df52349bb0	IR: Add MDSubprogram::replaceFunction() llvm-svn: 229742	2015-02-18 20:32:57 +00:00
Duncan P. N. Exon Smith	89b075e53a	IR: Drop the scope in DI template parameters The scope/context is always the compile unit, which we replace with `nullptr` anyway (via `getNonCompileUnitScope()`). Drop it explicitly. I noticed this field was always null while writing testcase upgrade scripts to transition to the new hierarchy. Seems wasteful to transition it over if it's already out-of-use. llvm-svn: 229740	2015-02-18 20:30:45 +00:00
Duncan P. N. Exon Smith	e4450146fa	Fix -DNDEBUG -Werror build after r229733 llvm-svn: 229736	2015-02-18 19:56:50 +00:00
Reid Kleckner	4dd0304e34	dos2unix the WinEH file and tests llvm-svn: 229735	2015-02-18 19:52:46 +00:00
Duncan P. N. Exon Smith	8551d25fa9	IR: isScopeRef() should check isScope() r229733 removed an invalid use of `DIScopeRef`, so now we can enforce that a `DIScopeRef` is actually a scope. llvm-svn: 229734	2015-02-18 19:46:02 +00:00
Duncan P. N. Exon Smith	2a78e9bcb5	IR: Avoid DIScopeRef in DIImportedEntity::getEntity() `DIImportedEntity::getEntity()` currently returns a `DIScopeRef`, but the nodes it references aren't always `DIScope`s. In particular, it can reference global variables. Introduce `DIDescriptorRef` to avoid the lie. llvm-svn: 229733	2015-02-18 19:39:36 +00:00
Sanjoy Das	11b279a832	Partial fix for bug 22589 Don't spend the entire iteration space in the scalar loop prologue if computing the trip count overflows. This change also gets rid of the backedge check in the prologue loop and the extra check for overflowing trip-count. Differential Revision: http://reviews.llvm.org/D7715 llvm-svn: 229731	2015-02-18 19:32:25 +00:00
Justin Bogner	11ae7789ba	InstrProf: Don't combine expansion regions with code regions This was leading to duplicate counts when a code region happened to overlap exactly with an expansion. The combining behaviour only makes sense for code regions. llvm-svn: 229723	2015-02-18 19:01:06 +00:00
David Blaikie	30f2f3fc98	Remove unused member variables (-Wunused-private-field) llvm-svn: 229722	2015-02-18 18:52:49 +00:00
Justin Bogner	428c605dff	InstrProf: Handle unknown functions if they consist only of zero-regions This comes up when we generate coverage for a function but don't end up emitting the function at all - dead static functions or inline functions that aren't referenced in a particular TU, for example. In these cases we'd like to show that the function was never called, which is trivially true. llvm-svn: 229717	2015-02-18 18:40:46 +00:00
Andrew Kaylor	527c5dc68d	Adding implementation to outline C++ catch handlers for native Windows 64 exception handling. Differential Revision: http://reviews.llvm.org/D7363 llvm-svn: 229715	2015-02-18 18:31:51 +00:00
Justin Bogner	1d29c08095	InstrProf: Make CoverageMapping testable and add a basic unit test Make CoverageMapping easier to create, so that we can write targeted unit tests for its internals, and add some infrastructure to write these tests. Finally, add a simple unit test for basic functionality. llvm-svn: 229709	2015-02-18 18:01:14 +00:00
Jozef Kolek	3c6724f442	[mips][microMIPS] Make usage of ADDU16 and SUBU16 by code generator Differential Revision: http://reviews.llvm.org/D7609 llvm-svn: 229706	2015-02-18 17:33:56 +00:00
Jozef Kolek	1fd6548297	[mips][microMIPS] Implement JALX instruction Differential Revision: http://reviews.llvm.org/D5047 llvm-svn: 229702	2015-02-18 17:15:48 +00:00
Daniel Sanders	1779314e3c	[mips] Add backend support for Mips32r[35] and Mips64r[35]. Summary: These ISAs didn't add any instructions so they are almost identical to Mips32r2 and Mips64r2. Even the ELF e_flags are the same. However, the ISA revision in .MIPS.abiflags is 3 or 5 respectively instead of 2. Reviewers: vmedic Reviewed By: vmedic Subscribers: tomatabacu, llvm-commits, atanasyan Differential Revision: http://reviews.llvm.org/D7381 llvm-svn: 229695	2015-02-18 16:24:50 +00:00
Kit Barton	298beb5e86	This patch adds the VSX logical instructions introduced in the Power ISA 2.07. It also removes the added complexity that favors VMX versions of the three instructions. Phabricator review: http://reviews.llvm.org/D7616 Committing on Nemanja's behalf. llvm-svn: 229694	2015-02-18 16:21:46 +00:00
Tom Stellard	1ca873bbc5	R600/SI: Don't set isCodeGenOnly = 1 on all instructions We only need to set this on pseudo instructions which won't be used by the assembler. llvm-svn: 229689	2015-02-18 16:08:17 +00:00
Tom Stellard	c34c37ae66	R600/SI: Add missing VOP1 instructions llvm-svn: 229688	2015-02-18 16:08:15 +00:00
Tom Stellard	894b9883f4	R600/SI: Add missing VOP2 instructions llvm-svn: 229687	2015-02-18 16:08:14 +00:00
Tom Stellard	0c0008cb6e	R600/SI: Add definition for S_CBRANCH_G_FORK llvm-svn: 229686	2015-02-18 16:08:13 +00:00
Tom Stellard	ce449ade7e	R600/SI: Add missing SOP1 instructions llvm-svn: 229685	2015-02-18 16:08:11 +00:00
Tom Stellard	ee21faa029	R600/SI: Refactor SOP2 definitions llvm-svn: 229684	2015-02-18 16:08:09 +00:00
Vasileios Kalintiris	611cb70b83	[mips] Avoid redundant sign extension of the result of binary bitwise instructions. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7581 llvm-svn: 229675	2015-02-18 14:57:05 +00:00
Benjamin Kramer	6ca8992018	X86: Use bitset to manage a bag of bits. NFC. Doesn't matter in terms of memory usage or perf here, but it's a neat simplification. llvm-svn: 229672	2015-02-18 14:10:44 +00:00
Toma Tabacu	8874eac5e6	[mips] [IAS] Fix using .cpsetup with local labels (PR22518). Summary: Parse for an MCExpr instead of an Identifier and use the symbol for relocations, not just the symbol's name. This fixes errors when using local labels in .cpsetup (PR22518). Reviewers: dsanders Reviewed By: dsanders Subscribers: seanbruno, emaste, llvm-commits Differential Revision: http://reviews.llvm.org/D7697 llvm-svn: 229671	2015-02-18 13:46:53 +00:00
Chandler Carruth	bbb377c3a1	[x86] Tighten the assertions to document that canonicalization has actually removed all but a very small number of choices for v2i64. Also remove dead code handling cases that simply cannot arise. llvm-svn: 229670	2015-02-18 11:46:29 +00:00
Chandler Carruth	811f0ee8c1	[x86] Switch an if which is trivially true to an assert. NFC llvm-svn: 229669	2015-02-18 11:46:27 +00:00
Chandler Carruth	8f3e585b17	[x86] Remove some more 'bit' nomenclature from the generic shift lowering. llvm-svn: 229668	2015-02-18 11:46:23 +00:00
Mohit K. Bhakkad	518946e440	[MSan][MIPS] VarArgHelper for MIPS64 Reviewers: eugenis, kcc, samsonov, petarj Subscribers: dsanders, sagar, llvm-commits Differential Revision: http://reviews.llvm.org/D7182 llvm-svn: 229667	2015-02-18 11:41:24 +00:00
Chandler Carruth	672a98ea28	[x86] Fold together the two shift lowering strategies. They were doing quite literally the same work, we just need to special case the >64-bit element shift code emission to emit the byte shift instructions and offsets. This also makes reasoning about each of the vector lowering strategies easier as we don't have to remember to use both forms. llvm-svn: 229662	2015-02-18 10:40:38 +00:00
Bradley Smith	26c9922a59	[ARM] Add missing M/R class CPUs Add some of the missing M and R class Cortex CPUs, namely: Cortex-M0+ (called Cortex-M0plus for GCC compatibility) Cortex-M1 SC000 SC300 Cortex-R5 llvm-svn: 229660	2015-02-18 10:33:30 +00:00
Michael Kuperstein	af9befa6b7	Fixes two issues in SimplifyDemandedBits of sext_in_reg: 1) We should not try to simplify if the sext has multiple uses 2) There is no need to simplify if the source value is already sign-extended. Patch by Gil Rapaport <gil.rapaport@intel.com> Differential Revision: http://reviews.llvm.org/D6949 llvm-svn: 229659	2015-02-18 09:43:40 +00:00
Ulrich Weigand	b7e5909a42	[SystemZ] Clean up warning Removed (unreachable) default case in switch to clean up warning: lib/Target/SystemZ/SystemZISelLowering.cpp:1974:5: error: default label in switch which covers all enumeration values [-Werror,-Wcovered-switch-default] llvm-svn: 229658	2015-02-18 09:42:23 +00:00
Chandler Carruth	48cc6c623a	[x86] Refactor the bit shift code the same as I just did the byte shift code. While this didn't have the miscompile (it used MatchLeft consistently) it missed some cases where it could use right shifts. I've added a test case Craig Topper came up with to exercise the right shift matching. This code is really identical between the two. I'm going to merge them next so that we don't keep two copies of all of this logic. llvm-svn: 229655	2015-02-18 09:19:58 +00:00
Ulrich Weigand	7db6918e2b	[SystemZ] Support all TLS access models - CodeGen part The current SystemZ back-end only supports the local-exec TLS access model. This patch adds all required CodeGen support for the other TLS models, which means in particular: - Expand initial-exec TLS accesses by loading TLS offsets from the GOT using @indntpoff relocations. - Expand general-dynamic and local-dynamic accesses by generating the appropriate calls to __tls_get_offset. Note that this routine has a non-standard ABI and requires loading the GOT pointer into %r12, so the patch also adds support for the GLOBAL_OFFSET_TABLE ISD node. - Add a new platform-specific optimization pass to remove redundant __tls_get_offset calls in the local-dynamic model (modeled after the corresponding X86 pass). - Add test cases verifying all access models and optimizations. llvm-svn: 229654	2015-02-18 09:13:27 +00:00
Ulrich Weigand	7bdd7c2346	[SystemZ] Support all TLS access models - MC part The current SystemZ back-end only supports the local-exec TLS access model. This patch adds all required MC support for the other TLS models, which means in particular: - Support additional relocation types for Initial-exec model: R_390_TLS_IEENT Local-dynamic-model: R_390_TLS_LDO32, R_390_TLS_LDO64, R_390_TLS_LDM32, R_390_TLS_LDM64, R_390_TLS_LDCALL General-dynamic model: R_390_TLS_GD32, R_390_TLS_GD64, R_390_TLS_GDCALL - Support assembler syntax to generate additional relocations for use with __tls_get_offset calls: :tls_gdcall: :tls_ldcall: The patch also adds a new test to verify fixups and relocations, and removes the (already unused) FK_390_PLT16DBL/FK_390_PLT32DBL fixup kinds. llvm-svn: 229652	2015-02-18 09:11:36 +00:00
NAKAMURA Takumi	a250484c4c	Reformat. llvm-svn: 229651	2015-02-18 08:36:14 +00:00
NAKAMURA Takumi	fa520c5f49	Revert r229622: "[LoopAccesses] Make VectorizerParams global" and others. r229622 brought cyclic dependencies between Analysis and Vector. r229622: "[LoopAccesses] Make VectorizerParams global" r229623: "[LoopAccesses] Stash the report from the analysis rather than emitting it" r229624: "[LoopAccesses] Cache the result of canVectorizeMemory" r229626: "[LoopAccesses] Create the analysis pass" r229628: "[LoopAccesses] Change debug messages from LV to LAA" r229630: "[LoopAccesses] Add canAnalyzeLoop" r229631: "[LoopAccesses] Add missing const to APIs in VectorizationReport" r229632: "[LoopAccesses] Split out LoopAccessReport from VectorizerReport" r229633: "[LoopAccesses] Add -analyze support" r229634: "[LoopAccesses] Change LAA:getInfo to return a constant reference" r229638: "Analysis: fix buildbots" llvm-svn: 229650	2015-02-18 08:34:47 +00:00
Daniel Jasper	ed9eb7209e	NFC: Use range-based for loops and more consistent naming. No functional changes intended. (I plan on doing some modifications to this function and would like to have as few unrelated changes as possible in the patch) llvm-svn: 229649	2015-02-18 08:19:16 +00:00
Daniel Jasper	4d7b04384e	Remove experimental options to control machine block placement. This reverts r226034. Benchmarking with those flags has not revealed anything interesting. llvm-svn: 229648	2015-02-18 08:18:07 +00:00
Sanjoy Das	c1065b9a4f	Address post commit review on r229600. llvm-svn: 229646	2015-02-18 08:03:22 +00:00
Elena Demikhovsky	714f23bcdb	AVX-512: Added support for FP instructions with embedded rounding mode. By Asaf Badouh <asaf.badouh@intel.com> llvm-svn: 229645	2015-02-18 07:59:20 +00:00
Chandler Carruth	55553f5299	[x86] Rewrite the byte shift detection to not use boolean variables to track state. I didn't like this in the code review because the pattern tends to be error prone, but I didn't see a clear way to rewrite it. Turns out that there were bugs here, I found them when fuzz testing our shuffle lowering for correctness on x86. The core of the problem is that we need to consistently test all our preconditions for the same directionality of shift and the same input vector. Instead, formulate this as two predicates (one doesn't depend on the input in any way), pass things like the directionality and input vector as inputs, and loop over the alternatives. This fixes a pattern of very rare miscompiles coming out of this code. Turned up roughly 4 out of every 1 million v8 shuffles in my fuzz testing. The new code is over half a million test runs with no failures yet. I've also fuzzed every other function in the lowering code with over 3.5 million test cases and not discovered any other miscompiles. llvm-svn: 229642	2015-02-18 07:13:48 +00:00
Craig Topper	1348f17205	[X86] Remove AVX512 pslldq/psrldq shift intrinsics. They aren't implemented yet and when they are they should be done with shuffles like SSE2 and AVX2. llvm-svn: 229641	2015-02-18 06:24:49 +00:00
Craig Topper	b324e43aed	[X86] Remove AVX2 and SSE2 pslldq and psrldq intrinsics. We can represent them in IR with vector shuffles now. All their uses have been removed from clang in favor of shuffles. llvm-svn: 229640	2015-02-18 06:24:44 +00:00
Saleem Abdulrasool	90b1d152b5	Analysis: fix buildbots This should fix the compilation failure on the MSVC buildbots which find a std::make_unique and llvm::make_unique via ADL, resulting in ambiguity. llvm-svn: 229638	2015-02-18 05:09:50 +00:00
Adam Nemet	85fd9f8d09	[LoopAccesses] Change LAA:getInfo to return a constant reference As expected, this required a few more const-correctness fixes. Based on Hal's feedback on D7684. llvm-svn: 229634	2015-02-18 03:44:33 +00:00
Adam Nemet	75bc2d111f	[LoopAccesses] Add -analyze support The LoopInfo in combination with depth_first is used to enumerate the loops. Right now -analyze is not yet complete. It only prints the result of the analysis, the report and the run-time checks. Printing the unsafe dependences will require a bit more reshuffling which I'd like to do in a follow-on to this patchset. Unsafe dependences are currently checked via -debug-only=loop-accesses in the new test. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229633	2015-02-18 03:44:30 +00:00
Adam Nemet	d7350dbb85	[LoopAccesses] Split out LoopAccessReport from VectorizerReport The only difference between these two is that VectorizerReport adds a vectorizer-specific prefix to its messages. When LAA is used in the vectorizer context the prefix is added when we promote the LoopAccessReport into a VectorizerReport via one of the constructors. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229632	2015-02-18 03:44:25 +00:00
Adam Nemet	8b12afbeee	[LoopAccesses] Add missing const to APIs in VectorizationReport When I split out LoopAccessReport from this, I need to create some temps so constness becomes necessary. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229631	2015-02-18 03:44:20 +00:00
Adam Nemet	450d417ecf	[LoopAccesses] Add canAnalyzeLoop This allows the analysis to be attempted with any loop. This feature will be used with -analyze. (LV only requests the analysis on loops that have already satisfied these tests.) This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229630	2015-02-18 03:44:08 +00:00
Adam Nemet	a8945b7790	[LoopAccesses] Factor out RuntimePointerCheck::needsChecking Will be used by the new RuntimePointerCheck::print. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229629	2015-02-18 03:43:58 +00:00
Adam Nemet	d0db4c1395	[LoopAccesses] Change debug messages from LV to LAA Also add pass name as an argument to VectorizationReport::emitAnalysis. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229628	2015-02-18 03:43:37 +00:00
Adam Nemet	d6b7e29815	[LoopAccesses] Create the analysis pass This is a function pass that runs the analysis on demand. The analysis can be initiated by querying the loop access info via LAA::getInfo. It either returns the cached info or runs the analysis. Symbolic stride information continues to reside outside of this analysis pass. We may move it inside later but it's not a priority for me right now. The idea is that Loop Distribution won't support run-time stride checking at least initially. This means that when querying the analysis, symbolic stride information can be provided optionally. Whether stride information is used can invalidate the cache entry and rerun the analysis. Note that if the loop does not have any symbolic stride, the entry should be preserved across Loop Distribution and LV. Since currently the only user of the pass is LV, I just check that the symbolic stride information didn't change when using a cached result. On the LV side, LoopVectorizationLegality requests the info object corresponding to the loop from the analysis pass. A large chunk of the diff is due to LAI becoming a pointer from a reference. A test will be added as part of the -analyze patch. Also tested that with AVX, we generate identical assembly output for the testsuite (including the external testsuite) before and after. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229626	2015-02-18 03:43:24 +00:00
Adam Nemet	01abb2c355	[LoopAccesses] Make blockNeedsPredication static blockNeedsPredication is in LoopAccess in order to share it with the vectorizer. It's a utility needed by LoopAccess not strictly provided by it but it's a good place to share it. This makes the function static so that it is no longer required to create a LoopAccessInfo instance in order to access it from LV. This was actually causing problems because it would have required creating LAI much earlier than LV::canVectorizeMemory(). This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229625	2015-02-18 03:43:19 +00:00
Adam Nemet	3cf32ad6db	[LoopAccesses] Cache the result of canVectorizeMemory LAA will be an on-demand analysis pass, so we need to cache the result of the analysis. canVectorizeMemory is renamed to analyzeLoop which computes the result. canVectorizeMemory becomes the query function for the cached result. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229624	2015-02-18 03:42:57 +00:00
Adam Nemet	5474be2c80	[LoopAccesses] Stash the report from the analysis rather than emitting it The transformation passes will query this and then emit it as part of their own report. The only current user, LV, is modified to do just that. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229623	2015-02-18 03:42:50 +00:00
Adam Nemet	4f3ede5a01	[LoopAccesses] Make VectorizerParams global As LAA is becoming a pass, we can no longer pass the params to its constructor. This changes the command line flags to have external storage. These can now be accessed both from LV and LAA. VectorizerParams is moved out of LoopAccessInfo in order to shorten the code to access it. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229622	2015-02-18 03:42:43 +00:00
Adam Nemet	30f16e1696	[LoopAccesses] Rename LoopAccessAnalysis to LoopAccessInfo LoopAccessAnalysis will be used as the name of the pass. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229621	2015-02-18 03:42:35 +00:00
Akira Hatanaka	1defd5afbd	[InstCombine] Do not insert a GEP instruction before a landingpad instruction. InstCombiner::visitGetElementPtrInst was using getFirstNonPHI to compute the insertion point, which caused the verifier to complain when a GEP was inserted before a landingpad instruction. This commit fixes it to use getFirstInsertionPt instead. rdar://problem/19394964 llvm-svn: 229619	2015-02-18 03:30:11 +00:00
Hal Finkel	4393559621	[BDCE] Don't forget uses of root instructions seen before the instruction itself When visiting the initial list of "root" instructions (those which must always be alive), for those that are integer-valued (such as invokes returning an integer), we mark their bits as (initially) all dead (we might, obviously, find uses of those bits later, but all bits are assumed dead until proven otherwise). Don't do so, however, if we've already seen a use of those bits by another root instruction (such as a store). Fixes a miscompile of the sanitizer unit tests on x86_64. Also, add a debug line for visiting the root instructions, and remove a debug line which tried to print instructions being removed (printing dead instructions is dangerous, and can sometimes crash). llvm-svn: 229618	2015-02-18 03:12:28 +00:00
Matt Arsenault	0ba644b66b	R600/SI: Rename dst encoding field to be consistent with docs The docs call this vdst instead of just dst. llvm-svn: 229614	2015-02-18 02:15:37 +00:00
Matt Arsenault	e3dbcf6656	R600/SI: Consistently capitalize encoding field names Some formats capitalized these, but most didn't. Change them all to be consistently lowercase. Now, non-encoding fields and convenience bits are capitalized. Also remove weird looking empty line in some of the formats. llvm-svn: 229613	2015-02-18 02:15:35 +00:00
Matt Arsenault	1ecac06a6f	R600/SI: Set noNamedPositionallyEncodedOperands llvm-svn: 229612	2015-02-18 02:15:32 +00:00
Matt Arsenault	096ec1e10c	R600/SI: Fix src1_modifiers for class instructions src1 doesn't have modifiers, but the operand was missing, resulting in an encoding build error when all fields are required. llvm-svn: 229611	2015-02-18 02:15:30 +00:00
Matt Arsenault	65fa1c425d	R600/SI: Fix not setting clamp / omod for v_cndmask_b32_e64 Rename the multiclass since it now applies to the output modifiers as well. llvm-svn: 229610	2015-02-18 02:15:27 +00:00
Matt Arsenault	284d7dfb53	R600: Fix operand encoding error llvm-svn: 229609	2015-02-18 02:10:42 +00:00
Matt Arsenault	1991f5e40b	R600/SI: Fix encoding error from glc bit on VI SMRD instructions llvm-svn: 229608	2015-02-18 02:10:40 +00:00
Matt Arsenault	e6c5241814	R600/SI: Fix operand encoding for flat instructions llvm-svn: 229607	2015-02-18 02:10:37 +00:00
Matt Arsenault	07e3bb153f	R600/SI: Fix error from vdst on no return atomics Set the ignored field to 0 so we can enable noNamedPositionallyEncodedOperands. llvm-svn: 229606	2015-02-18 02:10:35 +00:00
Matt Arsenault	caa1288fff	R600/SI: Add missing offset operand to buffer bothen llvm-svn: 229605	2015-02-18 02:04:38 +00:00
Matt Arsenault	2ad8bab7ee	R600/SI: Add missing soffset operand to global atomics llvm-svn: 229604	2015-02-18 02:04:35 +00:00
Matt Arsenault	3c34ae293c	R600/SI: Fix brace indentation llvm-svn: 229603	2015-02-18 02:04:31 +00:00
Justin Bogner	2b6c537bdc	Re-apply "InstrProf: Add unit tests for the profile reader and writer" Have the InstrProfWriter return a MemoryBuffer instead of a std::string. This fixes the alignment issues the reader would hit, and it's a more appropriate type for this anyway. I've also removed an ugly helper function that's not needed since we're allowing initializer lists now, and updated some error code checks based on MSVC's issues with r229473. This reverts r229483, reapplying r229478. llvm-svn: 229602	2015-02-18 01:58:17 +00:00
Matthias Braun	11042c8523	LiveRangeCalc: Rename some parameters from kill to use, NFC. Those parameters did not necessarily describe kill points but just uses. llvm-svn: 229601	2015-02-18 01:50:52 +00:00
Sanjoy Das	4153f47026	Generalize getExtendAddRecStart to work with both sign and zero extensions. This change also removes `DEBUG(dbgs() << "SCEV: untested prestart overflow check\n");` because that case has a unit test now. Differential Revision: http://reviews.llvm.org/D7645 llvm-svn: 229600	2015-02-18 01:47:07 +00:00
Eric Christopher	8af49b3214	Make the Mips AsmPrinter independent of global subtarget initialization. Initialize the subtarget once per function and migrate EmitStartOfAsmFile to either use calls on the TargetMachine or get information from the subtarget we'd use for assembling. The top-level-ness of the MIPS attribute output for assembly is, by nature, contrary to how we'd want to do this for an LTO situation where we have multiple cpu architectures so this solution is good enough for now. llvm-svn: 229596	2015-02-18 01:01:57 +00:00
Eric Christopher	bbe6ff50f3	Unify selectMipsCPU implementations. llvm-svn: 229595	2015-02-18 00:55:06 +00:00
Sanjoy Das	102061a494	Bugfix: SCEV incorrectly marks certain expressions as nsw I could not come up with a test case for this one; but I don't think `getPreStartForSignExtend` can assume `AR` is `nsw` -- there is one place in scalar evolution that calls `getSignExtendAddRecStart(AR, ...)` without proving that `AR` is `nsw` (line 1564) OperandExtendedAdd = getAddExpr(WideStart, getMulExpr(WideMaxBECount, getZeroExtendExpr(Step, WideTy))); if (SAdd == OperandExtendedAdd) { // If AR wraps around then // // abs(Step) * MaxBECount > unsigned-max(AR->getType()) // => SAdd != OperandExtendedAdd // // Thus (AR is not NW => SAdd != OperandExtendedAdd) <=> // (SAdd == OperandExtendedAdd => AR is NW) const_cast<SCEVAddRecExpr *>(AR)->setNoWrapFlags(SCEV::FlagNW); // Return the expression with the addrec on the outside. return getAddRecExpr(getSignExtendAddRecStart(AR, Ty, this), getZeroExtendExpr(Step, Ty), L, AR->getNoWrapFlags()); } Differential Revision: http://reviews.llvm.org/D7640 llvm-svn: 229594	2015-02-18 00:43:19 +00:00
Rafael Espindola	2d7ec9a860	Twines should be passed by const ref. llvm-svn: 229590	2015-02-17 23:44:22 +00:00
Andrea Di Biagio	e7b58ee555	[X86][FastIsel] Teach how to select scalar integer to float/double conversions. This patch teaches fast-isel how to select a (V)CVTSI2SSrr for an integer to float conversion, and how to select a (V)CVTSI2SDrr for an integer to double conversion. Added test 'fast-isel-int-float-conversion.ll'. Differential Revision: http://reviews.llvm.org/D7698 llvm-svn: 229589	2015-02-17 23:40:58 +00:00
Rafael Espindola	df19519800	Add r228939 back with a fix. The problem in the original patch was not switching back to .text after printing an eh table. Original message: On ELF, put PIC jump tables in a non executable section. Fixes PR22558. llvm-svn: 229586	2015-02-17 23:34:51 +00:00
Sanjay Patel	e951a3839a	rename variables again because these tables also deal with stores; NFC Suggestion by Simon Pilgrim llvm-svn: 229574	2015-02-17 22:38:06 +00:00
Duncan P. N. Exon Smith	a55dcaf427	IR: fieldIsMDNode() should be false for MDString Simplify the code. It has been a while since the schema has been so "flexible". llvm-svn: 229573	2015-02-17 22:34:15 +00:00
Duncan P. N. Exon Smith	57bab0bc97	AsmPrinter: Take range in DwarfExpression::AddExpression(), NFC Previously `DwarfExpression::AddExpression()` relied on default-constructing the end iterators for `DIExpression` -- once the operands are represented explicitly via `MDExpression` (instead of via the strange `StringRef` navigator in `DIHeaderIterator`) this won't work. Explicitly take an iterator for the end of the range. llvm-svn: 229572	2015-02-17 22:30:56 +00:00
Simon Pilgrim	1d89a02abb	[X86][SSE] Generalised unpckl/unpckh shuffle matching Added commuted unpckl/unpckh shuffle matching patterns as many cases containing undefined lanes fail to commute by themselves. Differential Revision: http://reviews.llvm.org/D7564 llvm-svn: 229571	2015-02-17 22:24:32 +00:00
Sanjay Patel	1a20fdf36f	Add comment to explain a non-obvious setting; NFC. This is paraphrased from Simon Pilgrim's comment in: http://reviews.llvm.org/D7492 llvm-svn: 229566	2015-02-17 22:09:54 +00:00
Sanjay Patel	203ee500e9	remove function names from comments; NFC llvm-svn: 229558	2015-02-17 21:55:20 +00:00
Sanjay Patel	52f9f7c0f3	replace meaningless variable names; NFCI llvm-svn: 229549	2015-02-17 21:37:28 +00:00
Rafael Espindola	68fa249cb5	Add r228980 back. Add support for having multiple sections with the same name and comdat. Using this in combination with -ffunction-sections allows LLVM to output a .o file with multiple sections named .text. This saves space by avoiding long unique names of the form .text.<C++ mangled name>. llvm-svn: 229541	2015-02-17 20:48:01 +00:00
Rafael Espindola	7fe7e05379	Add r228889 back. Original message: Invert the section relocation map. It now points from rel section to section. Use it to set sh_info, avoiding a brittle name lookup. llvm-svn: 229539	2015-02-17 20:40:59 +00:00
Rafael Espindola	3a7c0eb32e	Add r228888 back. Original message: Use the existing SymbolTableIndex instead of doing a lookup. NFC. llvm-svn: 229538	2015-02-17 20:37:50 +00:00
Rafael Espindola	ead8549cab	Add r228886 back now that r229530 fixed the issue lldb was hitting. Original message: Create the Section -> Rel Section map when it is first needed. NFC. Saves a walk over every section. llvm-svn: 229536	2015-02-17 20:31:13 +00:00
Tom Stellard	7b3aa88ac1	R600/SI: Fix asan errors in SIFoldOperands We were trying to fold into implicit uses, which led to out of bounds access of the MCInstrDesc::OpInfo array. llvm-svn: 229533	2015-02-17 20:11:54 +00:00
Sanjay Patel	b811c1d6a5	prevent folding a scalar FP load into a packed logical FP instruction (PR22371) Change the memory operands in sse12_fp_packed_scalar_logical_alias from scalars to vectors. That's what the hardware packed logical FP instructions define: 128-bit memory operands. There are no scalar versions of these instructions...because this is x86. Generating the wrong code (folding a scalar load into a 128-bit load) is still possible using the peephole optimization pass and the load folding tables. We won't completely solve this bug until we either fix the lowering in fabs/fneg/fcopysign and any other places where scalar FP logic is created or fix the load folding in foldMemoryOperandImpl() to make sure it isn't changing the size of the load. Differential Revision: http://reviews.llvm.org/D7474 llvm-svn: 229531	2015-02-17 20:08:21 +00:00
Rafael Espindola	127b6c3ba7	Don't dereference the section_end() iterator. Hard to test given the undefined behavior nature. llvm-svn: 229530	2015-02-17 20:07:28 +00:00
Eric Christopher	a49d68e078	Make the ARM AsmPrinter independent of global subtarget initialization. Initialize the subtarget once per function and migrate Emit{Start|End}OfAsmFile to either use attributes on the TargetMachine or get information from the subtarget we'd use for assembling. One bit (getISAEncoding) touched the general AsmPrinter and the debug output. Handle this one by passing the function for the subprogram down and updating all callers and users. The top-level-ness of the ARM attribute output for assembly is, by nature, contrary to how we'd want to do this for an LTO situation where we have multiple cpu architectures so this solution is good enough for now. llvm-svn: 229528	2015-02-17 20:02:32 +00:00
Eric Christopher	ffc5ff32d1	80-column fixups. llvm-svn: 229527	2015-02-17 20:02:28 +00:00
Adrian Prantl	ea7f1c2d19	DIBuilder: add trackIfUnresolved() to all nodes that may be cyclic. Tested in clang/test/CodeGenObjCCXX/debug-info-cyclic.mm rdar://problem/19839612 llvm-svn: 229521	2015-02-17 19:17:39 +00:00
Simon Atanasyan	1d902b7cc7	[Object] Support reading 64-bit MIPS ELF archives The 64-bit MIPS ELF archive file format is used by MIPS64 targets. The main difference from a regular archive file is the symbol table format: 1. ar_name is equal to "/SYM64/" 2. number of symbols and offsets are 64-bit integers http://techpubs.sgi.com/library/manuals/4000/007-4658-001/pdf/007-4658-001.pdf Page 96 The patch allows reading of such archive files by llvm-nm, llvm-objdump and other tools. However, it does not support archive files whose number of symbols and/or offsets exceed 2^32. I think that is a rather rare case that requires more significant modification of the `Archive` class code. http://reviews.llvm.org/D7546 llvm-svn: 229520	2015-02-17 18:54:22 +00:00
Sanjay Patel	ab7e86e5be	Canonicalize splats as build_vectors (PR22283) This is a follow-on patch to: http://reviews.llvm.org/D7093 That patch canonicalized constant splats as build_vectors, and this patch removes the constant check so we can canonicalize all splats as build_vectors. This fixes the 2nd test case in PR22283: http://llvm.org/bugs/show_bug.cgi?id=22283 The unfortunate code duplication between SelectionDAG and DAGCombiner is discussed in the earlier patch review. At least this patch is just removing code... This improves an existing x86 AVX test and changes codegen in an ARM test. Differential Revision: http://reviews.llvm.org/D7389 llvm-svn: 229511	2015-02-17 16:54:32 +00:00
Tom Stellard	bc3776803b	R600/SI: Extend private extload pattern to include zext loads llvm-svn: 229507	2015-02-17 16:36:00 +00:00
Benjamin Kramer	6cd780ff21	Prefer SmallVector::append/insert over push_back loops. Same functionality, but hoists the vector growth out of the loop. llvm-svn: 229500	2015-02-17 15:29:18 +00:00
Elena Demikhovsky	ef035bb974	Fixed a bug in store sinking. The problem was in store-sink barrier check. Store sink barrier should be checked for ModRef (read-write) mode. http://llvm.org/bugs/show_bug.cgi?id=22613 llvm-svn: 229495	2015-02-17 13:10:05 +00:00
NAKAMURA Takumi	cd5a3673c3	OrcJIT: Appease msc18 not to be confused on executeCompileCallback<OrcX86_64>. llvm-svn: 229494	2015-02-17 12:53:16 +00:00
NAKAMURA Takumi	3e087357ce	Reformat. llvm-svn: 229493	2015-02-17 12:53:05 +00:00
Andrea Di Biagio	eb97f92489	[X86] Silence -Wsign-compare warnings. GCC 4.8 reported two new warnings due to comparisons between signed and unsigned integer expressions. The new warnings were accidentally introduced by revision 229480. Added explicit casts to silence the warnings. No functional change intended. llvm-svn: 229488	2015-02-17 11:20:11 +00:00
Justin Bogner	5b3ad88646	Revert "InstrProf: Add unit tests for the profile reader and writer" This added API to the InstrProfWriter to write to a string so I could write unittests without using temp files. This doesn't really work, since the format has tighter alignment requirements than a char. This reverts r229478 and its follow-up, r229481. llvm-svn: 229483	2015-02-17 09:21:43 +00:00
Elena Demikhovsky	ba84672519	AVX-512: changes in intel_ocl_bi calling conventions - added mask types v8i1 and v16i1 to possible function parameters - enabled passing 512-bit vectors in standard CC - added a test for KNL intel_ocl_bi conventions llvm-svn: 229482	2015-02-17 09:20:12 +00:00
Michael Kuperstein	ff5acaf50c	[X86] Combine vector anyext + and into a vector zext Vector zext tends to get legalized into a vector anyext, represented as a vector shuffle with an undef vector + a bitcast, that gets ANDed with a mask that zeroes the undef elements. Combine this into an explicit shuffle with a zero vector instead. This allows shuffle lowering to match it as a zext, instead of matching it as an anyext and emitting an explicit AND. This combine only covers a subset of the cases, but it's a start. Differential Revision: http://reviews.llvm.org/D7666 llvm-svn: 229480	2015-02-17 08:22:51 +00:00
Justin Bogner	218d0689a9	Re-apply "InstrProf: Add unit tests for the profile reader and writer" Add these tests again, but use va_list instead of initializer lists. This reverts r229456, reapplying r229455. llvm-svn: 229478	2015-02-17 07:50:59 +00:00
Eric Christopher	5c0e009d3a	Make the PowerPC AsmPrinter independent of global subtarget initialization. Initialize the subtarget once per function and migrate EmitStartOfAsmFile to either use attributes on the TargetMachine or get information from all of the various subtargets. llvm-svn: 229475	2015-02-17 07:21:21 +00:00
Eric Christopher	75dc3904a5	Add a FIXME to move IsLittleEndian to the target machine. llvm-svn: 229472	2015-02-17 06:45:17 +00:00
Eric Christopher	fee6aaf683	Move ABI handling and 64-bitness to the PowerPC target machine. This required changing how the computation of the ABI is handled and how some of the checks for ABI/target are done. llvm-svn: 229471	2015-02-17 06:45:15 +00:00
Duncan P. N. Exon Smith	752d6df22d	AsmPrinter: Use DIExpression default constructor, NFC llvm-svn: 229464	2015-02-17 02:42:45 +00:00
Chandler Carruth	55db07016e	[x86] Teach the unpack lowering to try wider element unpacks. This allows it to match still more places where previously we would have to fall back on floating point shuffles or other more complex lowering strategies. I'm hoping to replace some of the hand-rolled unpack matching with this routine as it gets more and more clever. llvm-svn: 229463	2015-02-17 02:12:24 +00:00
Hal Finkel	2bb61ba2fe	[BDCE] Add a bit-tracking DCE pass BDCE is a bit-tracking dead code elimination pass. It is based on ADCE (the "aggressive DCE" pass), with the added capability to track dead bits of integer valued instructions and remove those instructions when all of the bits are dead. Currently, it does not actually do this all-bits-dead removal, but rather replaces the instruction's uses with a constant zero, and lets instcombine (and the later run of ADCE) do the rest. Because we essentially get a run of ADCE "for free" while tracking the dead bits, we also do what ADCE does and remove actually-dead instructions as well (this includes instructions newly trivially dead because all bits were dead, but not all such instructions can be removed). The motivation for this is a case like: int __attribute__((const)) foo(int i); int bar(int x) { x |= (4 & foo(5)); x |= (8 & foo(3)); x |= (16 & foo(2)); x |= (32 & foo(1)); x |= (64 & foo(0)); x |= (128 & foo(4)); return x >> 4; } As it turns out, if you order the bit-field insertions so that all of the dead ones come last, then instcombine will remove them. However, if you pick some other order (such as the one above), the fact that some of the calls to foo() are useless is not locally obvious, and we don't remove them (without this pass). I did a quick compile-time overhead check using sqlite from the test suite (Release+Asserts). BDCE took ~0.4% of the compilation time (making it about twice as expensive as ADCE). I've not looked at why yet, but we eliminate instructions due to having all-dead bits in: External/SPEC/CFP2006/447.dealII/447.dealII External/SPEC/CINT2006/400.perlbench/400.perlbench External/SPEC/CINT2006/403.gcc/403.gcc MultiSource/Applications/ClamAV/clamscan MultiSource/Benchmarks/7zip/7zip-benchmark llvm-svn: 229462	2015-02-17 01:36:59 +00:00
Lang Hames	2754714fb9	[Orc] Update the Orc indirection utils and refactor the CompileOnDemand layer. This patch replaces most of the Orc indirection utils API with a new class: JITCompileCallbackManager, which creates and manages JIT callbacks. Exposing this functionality directly allows the user to create callbacks that are associated with user supplied compilation actions. For example, you can create a callback to lazily IR-gen something from an AST. (A kaleidoscope example demonstrating this will be committed shortly). This patch also refactors the CompileOnDemand layer to use the JITCompileCallbackManager API. llvm-svn: 229461	2015-02-17 01:18:38 +00:00
Duncan P. N. Exon Smith	b474937929	AsmPrinter: Stop creating DebugLocs While looking at a heap profile of a clang LTO bootstrap with -g, I noticed that 2.2% of memory in an `llvm-lto` of clang is from calling `DebugLoc::get()` in `collectVariableInfo()` (accounting for ~40% of memory used for `MDLocation`s). I suspect this was introduced by r226736, whose goal was to prevent uniquing of `DebugLoc`s (goal achieved, if so). There's no reason we need a `DebugLoc` here at all -- it was just being used for (in)convenient API -- so the fix is to pass the scope and inlined-at directly to `LexicalScopes::findInlinedScope()`. llvm-svn: 229459	2015-02-17 00:02:27 +00:00
Hal Finkel	5cedafb8cd	[PowerPC] Support non-direct-sub/superclass VSX copies Our register allocation has become better recently, it seems, and is now starting to generate cross-block copies into inflated register classes. These copies are not transformed into subregister insertions/extractions by the PPCVSXCopy class, and so need to be handled directly by PPCInstrInfo::copyPhysReg. The code to do this was almost there, but not quite (it was unnecessarily restricting itself to only the direct sub/super-register-class case (not copying between, for example, something in VRRC and the lower-half of VSRC which are super-registers of F8RC)). Triggering this behavior manually is difficult; I'm including two bugpoint-reduced test cases from the test suite. llvm-svn: 229457	2015-02-16 23:46:30 +00:00
Justin Bogner	fcb2de694a	Revert "InstrProf: Add unit tests for the profile reader and writer" Looks like the bots don't like my initializer lists. This reverts r229455 llvm-svn: 229456	2015-02-16 23:31:07 +00:00
Justin Bogner	f83e895fa7	InstrProf: Add unit tests for the profile reader and writer This required some minor API to be added to these types to avoid needing temp files. Also, I've used initializer lists in the tests, as MSVC 2013 claims to support them. I'll redo this without them if the bots complain. llvm-svn: 229455	2015-02-16 23:27:48 +00:00
Simon Atanasyan	79ba8407d2	[Mips] Add .MIPS.options section descriptor kinds enumeration No functional changes. llvm-svn: 229452	2015-02-16 22:59:29 +00:00
Ahmed Bougacha	bf2b90e92d	[ARM] Remove unused declaration. NFC. GlobalMerge was moved to lib/CodeGen a while ago, and is no longer called "ARMGlobalMerge". llvm-svn: 229448	2015-02-16 22:30:08 +00:00
Cameron McInally	c5764cbe4e	[AVX512] Make 512b vector floating point rounds legal on AVX512. llvm-svn: 229445	2015-02-16 22:15:42 +00:00
Matthias Braun	15635c5f85	RegisterCoalescer: Don't rematerialize subregister definitions. We cannot simply rematerialize instructions which only define a subregister, as the final value also depends on the previous instructions. This fixes test/CodeGen/R600/subreg-coalescer-bug.ll with subreg liveness enabled. llvm-svn: 229444	2015-02-16 22:05:17 +00:00
Matthias Braun	1b901a4435	RegisterCoalescer: Do not look for regclass of IMPLICIT_DEF. IMPLICIT_DEF is a generic instruction and has no (fixed) output register class defined. The rematerialization code of the register coalescer should not scan the instruction description for a register class. This fixes a problem showing up in test/CodeGen/R600/subreg-coalescer-crash.ll with subregister liveness enabled. llvm-svn: 229443	2015-02-16 22:05:12 +00:00
Simon Pilgrim	b2c00f3286	[X86][SSE] Add SSE MOVQ instructions to SSEPackedInt domain Patch to explicitly add the SSE MOVQ (rr,mr,rm) instructions to SSEPackedInt domain - prevents a number of costly domain switches. Differential Revision: http://reviews.llvm.org/D7600 llvm-svn: 229439	2015-02-16 21:50:56 +00:00
Mehdi Amini	3e0023b8f6	SelectionDAG: fold (fp_to_u/sint (s/uint_to_fp)) here too Update SPARC tests to match. From: Fiona Glaser <fglaser@apple.com> llvm-svn: 229438	2015-02-16 21:47:58 +00:00
Mehdi Amini	b9a0fa4822	InstCombine: fold more cases of (fp_to_u/sint (u/sint_to_fp val)) Fixes radar 15486701. From: Fiona Glaser <fglaser@apple.com> llvm-svn: 229437	2015-02-16 21:47:54 +00:00
Justin Bogner	ab89ed7dd5	InstrProf: Use ErrorOr for IndexedInstrProfReader::create (NFC) The other InstrProfReader::create factories were updated to return ErrorOr in r221120, and it's odd for these APIs not to match. llvm-svn: 229433	2015-02-16 21:28:58 +00:00

77256 Commits