llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	3dc37cc592	[X86] Add a bunch of -mcpu strings to the cpus.ll test. We were missing most of the "core" aliases as well as skylake, cannonlake, and knights landing. llvm-svn: 315606	2017-10-12 18:55:57 +00:00
Artem Belevich	3bafc2f0d9	[NVPTX] Implemented wmma intrinsics and instructions. WMMA = "Warp Level Matrix Multiply-Accumulate". These are the new instructions introduced in PTX6.0 and available on sm_70 GPUs. Differential Revision: https://reviews.llvm.org/D38645 llvm-svn: 315601	2017-10-12 18:27:55 +00:00
Reid Kleckner	1a7e387849	[codeview] Don't emit FPO data in funclet prologues Attempt 3 to work around bugs in FPO data with funclets. llvm-svn: 315600	2017-10-12 18:20:35 +00:00
Justin Bogner	754a1a8a6f	llvm-isel-fuzzer: Work around BUILD_SHARED_LIBS testing issues Building with BUILD_SHARED_LIBS makes it tricky to copy around executables at will, since they won't be able to find the LLVM libraries any more. This makes testing a feature that's based on the executable name problematic, so we'll just disable these two tests in that configuration. We could potentially fix this by symlinking the lib directory into the test directory, but that wouldn't work on windows, and losing testing on windows would be far worse than losing testing on a configuration that's barely even supported. llvm-svn: 315599	2017-10-12 18:10:22 +00:00
Artem Belevich	786ca6a166	[TableGen] Allow intrinsics to have up to 8 return values. Differential Revision: https://reviews.llvm.org/D38633 llvm-svn: 315598	2017-10-12 17:40:00 +00:00
Sanjay Patel	e272be7c9a	[ValueTracking] return zero when there's conflict in known bits of a shift (PR34838) Poison allows us to return a better result than undef. llvm-svn: 315595	2017-10-12 17:31:46 +00:00
Bruno Cardoso Lopes	326fdcbff8	Reintroduce "[SCCP] Propagate integer range info for parameters in IPSCCP." This is r315288 & r315294, which were reverted due to stage2 bot failures. Summary: This updates the SCCP solver to use the ValueElement lattice for parameters, which provides integer range information. The range information is used to remove unneeded icmp instructions. For the following function, f() can be optimized to `ret i32 2` with this change: source_filename = "sccp.c" target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" ; Function Attrs: norecurse nounwind readnone uwtable define i32 @main() local_unnamed_addr #0 { entry: %call = tail call fastcc i32 @f(i32 1) %call1 = tail call fastcc i32 @f(i32 47) %add3 = add nsw i32 %call, %call1 ret i32 %add3 } ; Function Attrs: noinline norecurse nounwind readnone uwtable define internal fastcc i32 @f(i32 %x) unnamed_addr #1 { entry: %c1 = icmp sle i32 %x, 100 %cmp = icmp sgt i32 %x, 300 %. = select i1 %cmp, i32 1, i32 2 ret i32 %. } attributes #1 = { noinline } Reviewers: davide, sanjoy, efriedma, dberlin Reviewed By: davide, dberlin Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D36656 llvm-svn: 315593	2017-10-12 16:54:11 +00:00
Lei Huang	0724fea2da	[PowerPC] Add profitability check for conversion to mtctr loops Add profitability checks for modifying counted loops to use the mtctr instruction. The latency of mtctr is only justified if there are more than 4 comparisons that will be removed as a result. Usually counted loops are formed relatively early and before unrolling, so most low trip count loops often don't survive. However we want to ensure that if they do, we do not mistakenly update them to mtctr loops. Use CodeMetrics to ensure we are only doing this for small loops with small trip counts. Differential Revision: https://reviews.llvm.org/D38212 llvm-svn: 315592	2017-10-12 16:43:33 +00:00
Tim Renouf	c8ffffe462	[AMDGPU] For amdpal, widen interpolation mode workaround Summary: The interpolation mode workaround ensures that at least one interpolation mode is enabled in PSInputAddr. It does not also check PSInputEna on the basis that the user might enable bits in that depending on run-time state. However, for amdpal os type, the user does not enable some bits after compilation based on run-time states; the register values being generated here are the final ones set in the hardware. Therefore, apply the workaround to PSInputAddr and PSInputEnable together. (The case where a bit is set in PSInputAddr but not in PSInputEnable is where the frontend set up an input arg for a particular interpolation mode, but nothing uses that input arg. Really we should have an earlier pass that removes such an arg.) Reviewers: arsenm, nhaehnle, dstuttard Subscribers: kzhuravl, wdng, yaxunl, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D37758 llvm-svn: 315591	2017-10-12 16:16:41 +00:00
Mikael Holmen	a079ef68e3	[RegisterCoalescer] Don't set read-undef in pruneValues, only clear Summary: The comments in the code said // Remove <def,read-undef> flags. This def is now a partial redef. but the code didn't just remove read-undef, it could introduce new ones which could cause errors. E.g. if we have something like %vreg1<def> = IMPLICIT_DEF %vreg2:subreg1<def, read-undef> = op %vreg3, %vreg4 %vreg2:subreg2<def> = op %vreg6, %vreg7 and we merge %vreg1 and %vreg2 then we should not set undef on the second subreg def, which the old code did. Now we solve this by actually doing what the code comment says: we only remove read-undef flags rather than both removing and introducing them. Reviewers: qcolombet, MatzeB Reviewed By: MatzeB Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38616 llvm-svn: 315564	2017-10-12 06:21:28 +00:00
Justin Bogner	9ea7fbd1e8	Re-commit "llvm-isel-fuzzer: Handle a subset of backend flags in the exec name" Here we add a secondary option parser to llvm-isel-fuzzer (and provide it for use with other fuzzers). With this, you can copy the fuzzer to a name like llvm-isel-fuzzer=aarch64-gisel for a fuzzer that fuzzes AArch64 with GlobalISel enabled, or fuzzer=x86_64 to fuzz x86, with no flags required. This should be useful for running these in OSS-Fuzz. Note that this handrolls a subset of cl::opts to recognize, rather than embedding a complete command parser for argv[0]. If we find we really need the flexibility of handling arbitrary options at some point we can rethink this. This re-applies 315545 using "=" instead of ":" as a separator for arguments. llvm-svn: 315557	2017-10-12 04:35:32 +00:00
Hans Wennborg	022829d84c	Revert r315545 "llvm-isel-fuzzer: Handle a subset of backend flags in the executable name" It broke some tests on Windows: Failing Tests (4): LLVM :: tools/llvm-isel-fuzzer/execname-options.ll LLVM :: tools/llvm-isel-fuzzer/missing-triple.ll LLVM :: tools/llvm-isel-fuzzer/x86-empty-bc.ll LLVM :: tools/llvm-isel-fuzzer/x86-empty.ll > llvm-isel-fuzzer: Handle a subset of backend flags in the executable name > > Here we add a secondary option parser to llvm-isel-fuzzer (and provide > it for use with other fuzzers). With this, you can copy the fuzzer to > a name like llvm-isel-fuzzer:aarch64-gisel for a fuzzer that fuzzes > AArch64 with GlobalISel enabled, or fuzzer:x86_64 to fuzz x86, with no > flags required. This should be useful for running these in OSS-Fuzz. > > Note that this handrolls a subset of cl::opts to recognize, rather > than embedding a complete command parser for argv[0]. If we find we > really need the flexibility of handling arbitrary options at some > point we can rethink this. llvm-svn: 315554	2017-10-12 03:32:09 +00:00
Hongbin Zheng	d36f2030e2	[SimplifyIndVar] Replace IVUsers with loop invariant whenever possible Differential Revision: https://reviews.llvm.org/D38415 llvm-svn: 315551	2017-10-12 02:54:11 +00:00
Justin Bogner	a5969ce15f	llvm-isel-fuzzer: Handle a subset of backend flags in the executable name Here we add a secondary option parser to llvm-isel-fuzzer (and provide it for use with other fuzzers). With this, you can copy the fuzzer to a name like llvm-isel-fuzzer:aarch64-gisel for a fuzzer that fuzzes AArch64 with GlobalISel enabled, or fuzzer:x86_64 to fuzz x86, with no flags required. This should be useful for running these in OSS-Fuzz. Note that this handrolls a subset of cl::opts to recognize, rather than embedding a complete command parser for argv[0]. If we find we really need the flexibility of handling arbitrary options at some point we can rethink this. llvm-svn: 315545	2017-10-12 01:57:49 +00:00
Wei Mi	1736efd16a	Revert r307036 because of PR34919. llvm-svn: 315540	2017-10-12 00:24:52 +00:00
Konstantin Zhuravlyov	c3beb6a075	AMDGPU/NFC: Minor clean ups in PAL metadata - Move PAL metadata definitions to AMDGPUMetadata - Make naming consistent with HSA metadata Differential Revision: https://reviews.llvm.org/D38745 llvm-svn: 315523	2017-10-11 22:41:09 +00:00
Konstantin Zhuravlyov	a63b0f9d20	AMDGPU/NFC: Rename code object metadata as HSA metadata - Rename AMDGPUCodeObjectMetadata to AMDGPUMetadata (PAL metadata will be included in this file in the follow up change) - Rename AMDGPUCodeObjectMetadataStreamer to AMDGPUHSAMetadataStreamer - Introduce HSAMD namespace - Other minor name changes in function and test names llvm-svn: 315522	2017-10-11 22:18:53 +00:00
Reid Kleckner	ddf413f3e1	Really fix llvm-rc include-paths.test llvm-svn: 315515	2017-10-11 21:27:54 +00:00
Reid Kleckner	ade90cbd79	Attempt to fix failing llvm-rc include-paths.test llvm-svn: 315514	2017-10-11 21:25:03 +00:00
Reid Kleckner	9cdd4df81a	[codeview] Implement FPO data assembler directives Summary: This adds a set of new directives that describe 32-bit x86 prologues. The directives are limited and do not expose the full complexity of codeview FPO data. They are merely a convenience for the compiler to generate more readable assembly so we don't need to generate tons of labels in CodeGen. If our prologue emission changes in the future, we can change the set of available directives to suit our needs. These are modelled after the .seh_ directives, which use a different format that interacts with exception handling. The directives are: .cv_fpo_proc _foo .cv_fpo_pushreg ebp/ebx/etc .cv_fpo_setframe ebp/esi/etc .cv_fpo_stackalloc 200 .cv_fpo_endprologue .cv_fpo_endproc .cv_fpo_data _foo I tried to follow the implementation of ARM EHABI CFI directives by sinking most directives out of MCStreamer and into X86TargetStreamer. This helps avoid polluting non-X86 code with WinCOFF specific logic. I used cdb to confirm that this can show locals in parent CSRs in a few cases, most importantly the one where we use ESI as a frame pointer, i.e. the one in http://crbug.com/756153#c28 Once we have cdb integration in debuginfo-tests, we can add integration tests there. Reviewers: majnemer, hans Subscribers: aemerson, mgorny, kristof.beyls, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D38776 llvm-svn: 315513	2017-10-11 21:24:33 +00:00
Krzysztof Parzyszek	c4a9a8d8e0	[Hexagon] Make sure that new-value jump is packetized with producer llvm-svn: 315510	2017-10-11 21:20:43 +00:00
Florian Hahn	e52abba277	[MachineCombiner] Fix initialisation of LastUpdate for incremental update. Summary: Fixes a bogus iterator resulting from the removal of a block's first instruction at the point that incremental update is enabled. Patch by Paul Walker. Reviewers: fhahn, Gerolf, efriedma, MatzeB Reviewed By: fhahn Subscribers: aemerson, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D38734 llvm-svn: 315502	2017-10-11 20:25:58 +00:00
Lei Huang	263dc4ef3a	[PowerPC] Utilize DQ-Form instructions for spill/restore and fix FrameIndex elimination to only use `lis/addi` if necessary. Currently we produce a bunch of unnecessary code when emitting the prologue/epilogue for spills/restores. Namely, if the load from stack slot/store to stack slot instruction is an X-Form instruction, we will always produce an LIS/ORI sequence for the stack offset. Furthermore, we have not exploited the P9 vector D-Form loads/stores for this purpose. This patch addresses both issues. Specifying the D-Form load as the instruction to use for stack spills/reloads should be safe because: 1. The stack should be aligned according to the ABI 2. If the stack isn't aligned, PPCRegisterInfo::eliminateFrameIndex() will check for the offset being a multiple of 16 and will convert it to an X-Form instruction if it isn't. Differential Revision: https://reviews.llvm.org/D38758 llvm-svn: 315500	2017-10-11 20:20:58 +00:00
Zachary Turner	fa0ca6cbd0	[llvm-rc] Use proper search algorithm for finding resources. Previously we would only look in the current directory for a resource, which might not be the same as the directory of the rc file. Furthermore, MSVC rc supports a /I option, and can also look in the system environment. This patch adds support for this search algorithm. Differential Revision: https://reviews.llvm.org/D38740 llvm-svn: 315499	2017-10-11 20:12:09 +00:00
Sanjay Patel	6c0aef77aa	[x86] avoid infinite loop from SoftenFloatOperand (PR34866) Legalization of fp128 assumes things that we should have asserts for, so that's another potential improvement. Differential Revision: https://reviews.llvm.org/D38771 llvm-svn: 315485	2017-10-11 18:24:21 +00:00
Jake Ehrlich	f03384dce7	Reland "[llvm-objcopy] Add support for --strip-sections to remove all section headers leaving only program headers and loadable segment data" ubsan caught an issue I made where I was converting a null pointer to a reference. elf utils implements a particularly extreme form of stripping that I'd like to support. eu-strip has an option called "strip-sections" that removes all section headers and leaves only program headers and the segment data. I have implemented this option partly as a test but mainly because in Fuchsia we would like to use this option to minimize the size of our executables. The other strip options that are on my list include --strip-all and --strip-debug. This is a preliminary implementation that I'd like to start using in Fuchsia builds if possible. This change implements such a stripping option for llvm-objcopy Differential Revision: https://reviews.llvm.org/D38335 llvm-svn: 315484	2017-10-11 18:09:18 +00:00
Lei Huang	f9c7f7fed4	[NFC] update test case so checks are not order dependent when not needed llvm-svn: 315482	2017-10-11 18:04:41 +00:00
Rafael Espindola	1a0e5a1933	Convert an ErrorOr to Expected. getRelocationAddend should never be called on non SHT_RELA sections, but changing that requires changing RelocVisitor.h. llvm-svn: 315473	2017-10-11 16:56:33 +00:00
Krzysztof Parzyszek	8f174dde92	[Pipeliner] Improve serialization order for post-increments The pipeliner is generating a serial sequence that causes poor register allocation when a post-increment instruction appears prior to the use of the post-increment register. This occurs when there is a circular set of dependences involved with a sequence of instructions in the same cycle. In this case, there is no serialization of the parallel semantics that will not cause an additional register to be allocated. This patch fixes the problem by changing the instructions so that the post-increment instruction is used by the subsequent instruction, which enables the register allocator to make a better decision and not require another register. Patch by Brendon Cahoon. llvm-svn: 315466	2017-10-11 15:51:44 +00:00
Sanjay Patel	8d565a233d	[InstCombine] add baseline tests for D38531; NFC llvm-svn: 315461	2017-10-11 14:29:17 +00:00
Sanjay Patel	34fd5eaaf0	[DAGCombiner] convert insertelement of bitcasted vector into shuffle Eg: insert v4i32 V, (v2i16 X), 2 --> shuffle v8i16 V', X', {0,1,2,3,8,9,6,7} This is a generalization of the IR fold in D38316 to handle insertion into a non-undef vector. We may want to abandon that one if we can't find value in squashing the more specific pattern sooner. We're using the existing legal shuffle target hook to avoid AVX512 horror with vXi1 shuffles. There may be room for improvement in the shuffle lowering here, but that would be follow-up work. Differential Revision: https://reviews.llvm.org/D38388 llvm-svn: 315460	2017-10-11 14:12:16 +00:00
Jonas Devlieghere	ec053332cf	Revert "[dsymutil] Timestamp verification for __swift_ast" This reverts commit r315456. llvm-svn: 315458	2017-10-11 13:51:30 +00:00
Jonas Devlieghere	8acb2e3ac4	[dsymutil] Timestamp verification for __swift_ast This patch adds timestamp verification for swiftmodule files. - A new flag is provided to allow us to continue testing of the code for embedding the __swift_ast. (git doesn't maintain timestamps) - Adds a new test for fat (arm) binaries. Differential revision: https://reviews.llvm.org/D38686 llvm-svn: 315456	2017-10-11 13:34:52 +00:00
Simon Dardis	442ee63468	[mips] Add missing tests from rL315451 llvm-svn: 315454	2017-10-11 11:45:06 +00:00
Uriel Korach	782f28bf2f	[X86] Added tests for TESTM and TESTNM (NFC) Adding these test files now so that a later commit, which will add a new pattern for TESTM and TESTNM instructions, will show the improvements that have been made. Change-Id: If3908b7f91897d764053312365a2bc1de78b291d llvm-svn: 315443	2017-10-11 08:39:25 +00:00
Max Kazantsev	3b81809e06	[GVN] Prevent LoadPRE from hoisting across instructions that don't pass control flow to successors This patch fixes the miscompile that happens when PRE hoists loads across guards and other instructions that don't always pass control flow to their successors. PRE is now prohibited from hoisting across such instructions because there is no guarantee that a load placed after such an instruction is still valid before it. For example, a load from under a guard may be invalid before the guard in the following case: int array[LEN]; ... guard(0 <= index && index < LEN); use(array[index]); Differential Revision: https://reviews.llvm.org/D37460 llvm-svn: 315440	2017-10-11 08:10:43 +00:00
Max Kazantsev	0c8dd052b8	[LICM] Disallow sinking of unordered atomic loads into loops Sinking of unordered atomic load into loop must be disallowed because it turns a single load into multiple loads. The relevant section of the documentation is: http://llvm.org/docs/Atomics.html#unordered, specifically the Notes for Optimizers section. Here is the full text of this section: > Notes for optimizers > In terms of the optimizer, this prohibits any transformation that > transforms a single load into multiple loads, transforms a store into > multiple stores, narrows a store, or stores a value which would not be > stored otherwise. Some examples of unsafe optimizations are narrowing > an assignment into a bitfield, rematerializing a load, and turning loads > and stores into a memcpy call. Reordering unordered operations is safe, > though, and optimizers should take advantage of that because unordered > operations are common in languages that need them. Patch by Daniil Suchkov! Reviewed By: reames Differential Revision: https://reviews.llvm.org/D38392 llvm-svn: 315438	2017-10-11 07:26:45 +00:00
Max Kazantsev	25d8655dc2	[IRCE] Do not process empty safe ranges IRCE should not apply when the safe iteration range is proved to be empty. In this case we do unneeded work creating pre/post loops and then never go to the main loop. This patch makes IRCE not apply to empty safe ranges, adds a test for this situation and also slightly modifies one of the existing tests where it used to happen. Reviewed By: anna Differential Revision: https://reviews.llvm.org/D38577 llvm-svn: 315437	2017-10-11 06:53:07 +00:00
Davide Italiano	e2138fe41b	[GVN] Don't replace constants with constants. This fixes PR34908. Patch by Alex Crichton! Differential Revision: https://reviews.llvm.org/D38765 llvm-svn: 315429	2017-10-11 04:21:51 +00:00
Jake Ehrlich	d9a283463a	Revert "[llvm-objcopy] Add support for --strip-sections to remove all section headers leaving only program headers and loadable segment data" This reverts commit rL315412 llvm-svn: 315417	2017-10-11 02:42:29 +00:00
Jake Ehrlich	b5152447ba	[llvm-objcopy] Add support for --strip-sections to remove all section headers leaving only program headers and loadable segment data elf utils implements a particularly extreme form of stripping that I'd like to support. eu-strip has an option called "strip-sections" that removes all section headers and leaves only program headers and the segment data. I have implemented this option partly as a test but mainly because in Fuchsia we would like to use this option to minimize the size of our executables. The other strip options that are on my list include --strip-all and --strip-debug. This is a preliminary implementation that I'd like to start using in Fuchsia builds if possible. This change implements such a stripping option for llvm-objcopy Differential Revision: https://reviews.llvm.org/D38335 llvm-svn: 315412	2017-10-11 01:59:06 +00:00
Craig Topper	6ce20bd184	[X86] Add 128-bit version of vbroadcasti32x2 to shuffle comment decoding. llvm-svn: 315395	2017-10-11 00:11:53 +00:00
Jake Ehrlich	fcc05627d4	[llvm-objcopy] Add ability to remove multiple sections by name This change adds the ability to use the "-R"/"-remove-section" option multiple times. Differential Revision: https://reviews.llvm.org/D38332 llvm-svn: 315385	2017-10-10 23:02:43 +00:00
Craig Topper	bb0e316dc7	[X86] Add broadcast patterns that allow a scalar_to_vector between the broadcast and the load. We already have these patterns for AVX512VL, but not AVX1 or 2. llvm-svn: 315382	2017-10-10 22:40:31 +00:00
Rafael Espindola	8f1f7b1442	Make the ELFFile constructor private. With this all clients have to use the new create method which returns an Expected. Fixes a crash on invalid input. llvm-svn: 315376	2017-10-10 22:17:49 +00:00
Rafael Espindola	ef421f9c18	Make the ELFObjectFile constructor private. This forces every user to use the new create method that returns an Expected. This in turn propagates better error messages. llvm-svn: 315371	2017-10-10 21:21:16 +00:00
Dehao Chen	3f56a05ae5	Use the first instruction's count to estimate the function's entry frequency. Summary: In the current implementation, we only have accurate profile count for standalone symbols. For inlined functions, we do not have entry count data because it's not available in LBR. In this patch, we use the first instruction's frequency to estimate the function's entry count, especially for inlined functions. This may be inaccurate due to debug info in optimized code. However, this is a better estimate than the static 80/20 estimation we have in the current implementation. Reviewers: tejohnson, davidxl Reviewed By: tejohnson Subscribers: sanjoy, llvm-commits, aprantl Differential Revision: https://reviews.llvm.org/D38478 llvm-svn: 315369	2017-10-10 21:13:50 +00:00
Sanjay Patel	b74063d21f	[x86] fix prefix typos for CHECK lines; NFC llvm-svn: 315368	2017-10-10 21:12:47 +00:00
Simon Dardis	b994128d14	[mips] Correct the instruction predicates for microMIPSr3 Rather than using the AdditionalPredicates mechanism to guard the microMIPS instructions, use the existing predicates to properly guard those instructions. This also resolves a case where an instruction pattern was incorrectly available for microMIPS32R6, which caused a register allocation failure as the registers specified in the pattern were not available. Reviewers: nitesh.jain, atanasyan Differential Revision: https://reviews.llvm.org/D38451 llvm-svn: 315362	2017-10-10 20:52:53 +00:00
Matt Arsenault	f42074b699	AMDGPU: Fix missing skipFunction calls llvm-svn: 315361	2017-10-10 20:48:36 +00:00

48136 Commits