llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	071ad9c6e0	[X86] Remove and autoupgrade kand/kandn/kor/kxor/kxnor/knot intrinsics. Clang already stopped using these a couple months ago. The test cases aren't great as there is nothing forcing the operations to stay in k-registers so some of them moved back to scalar ops due to the bitcasts being moved around. llvm-svn: 324177	2018-02-03 20:18:25 +00:00
David Green	9688ed61fe	Remove unneeded -debug argument from new test llvm-svn: 324176	2018-02-03 17:33:50 +00:00
David Green	7174023f57	[InstCombine] Allow common type conversions to i8/i16/i32 This, in instcombine, allows conversions to i8/i16/i32 (very common cases) even if the resulting type is not legal according to the data layout. This can often open up extra combine opportunities. Differential Revision: https://reviews.llvm.org/D42424 llvm-svn: 324174	2018-02-03 16:51:03 +00:00
Alex Bradbury	7c11527b03	[RISCV] Update two RISCV codegen tests after rL323991 From the discussion in D41835 it looks possible the change will be backed out, but for now let's fix the RISCV tests. llvm-svn: 324172	2018-02-03 13:02:30 +00:00
Sanjay Patel	a767ee5af0	[InstCombine] make sure tests are providing coverage for the stated pattern; NFC Without extra instructions and uses, swapMayExposeCSEOpportunities() would change the icmp (as seen in the check lines), so we were not actually testing patterns that should be handled by D41480. llvm-svn: 324143	2018-02-02 21:40:54 +00:00
Craig Topper	e7e147f52c	[X86] Add avx512 command line to ptest.ll to demonstrate that 512-bit vectors are not handled by LowerVectorAllZeroTest. llvm-svn: 324130	2018-02-02 20:12:45 +00:00
Craig Topper	bd2f6e9570	Partially revert r324124 [X86] Add tests for missed opportunities to use ptest for all ones comparison. Turns out I misunderstood the flag behavior of PTEST because I read the documentation for KORTEST which is different than PTEST/KTEST and made a bad assumption. Keep the test rename though cause that's useful. llvm-svn: 324129	2018-02-02 20:12:44 +00:00
Craig Topper	9c936f88b1	[X86] Add tests for missed opportunities to use ptest for all ones comparison. Also rename the test from pr12312.ll to ptest.ll so it's more recognizable. llvm-svn: 324124	2018-02-02 19:34:10 +00:00
Sanjay Patel	5b8cb26bcc	[InstCombine] add baseline tests for unsigned saturated sub (D41480); NFC llvm-svn: 324109	2018-02-02 17:43:16 +00:00
Craig Topper	e538fc74d4	[X86] Remove checks for FeatureAVX512 from the X86 assembly parser. Remove mcpu/mattr from assembly test command lines. Summary: We should always be able to accept AVX512 registers and instructions in llvm-mc. The only subtarget mode that should be checked is 16-bit vs 32-bit vs 64-bit mode. I've also removed all the mattr/mcpu lines from test RUN lines to be consistent with this. Most were due to AVX512, but a few were for other features. Fixes PR36202 Reviewers: RKSimon, echristo, bkramer Reviewed By: echristo Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42824 llvm-svn: 324106	2018-02-02 17:02:58 +00:00
Yaxun Liu	2a22c5deff	[AMDGPU] Switch to the new addr space mapping by default This requires corresponding clang change. Differential Revision: https://reviews.llvm.org/D40955 llvm-svn: 324101	2018-02-02 16:07:16 +00:00
Clement Courbet	a43e9653bb	Add llc tests for comparison chains. See https://reviews.llvm.org/D42793#996098 for context. llvm-svn: 324099	2018-02-02 15:54:17 +00:00
Simon Pilgrim	1cb9bc6b6c	[X86][SSE] Force double domain for SHUFPD stack folding tests llvm-svn: 324094	2018-02-02 14:55:20 +00:00
Ivan A. Kosarev	ab68bbe515	[Analysis] Support aggregate access types in TBAA This patch implements analysis for new-format TBAA access tags with aggregate types as their final access types. Differential Revision: https://reviews.llvm.org/D41501 llvm-svn: 324092	2018-02-02 14:09:22 +00:00
James Henderson	c2dfd502a2	Add missing new files from r324077 Differential Revision: https://reviews.llvm.org/D42481 llvm-svn: 324078	2018-02-02 12:45:57 +00:00
George Rimar	76c5fae2a0	[ThinLTO] - Fix for "ThinLTO inlines variables that should be discarded". This fixes PR36187. Patch teaches ThinLTO to drop non-prevailing variables, just like we recently did for functions (in r323633). Differential revision: https://reviews.llvm.org/D42798 llvm-svn: 324075	2018-02-02 12:17:33 +00:00
Sjoerd Meijer	986d64ad73	[ARM] fixed some tabs/whitespaces in test. NFC. llvm-svn: 324074	2018-02-02 11:51:06 +00:00
Mikael Holmen	b69e5b7393	[GlobalOpt] Include padding in debug fragments Summary: When creating the debug fragments for a SRA'd variable, use the types' allocation sizes. This fixes issues where the pass would emit too small fragments, placed at the wrong offset, for padded types. An example of this is long double on x86. The type is represented using x86_fp80, which is 10 bytes, but the value is aligned to 12/16 bytes. The padding is included in the type's DW_AT_byte_size attribute; therefore, the fragments should also include that. Newer GCC releases (I tested 7.2.0) emit 12/16-byte pieces for long double. Earlier releases, e.g. GCC 5.5.0, behaved as LLVM did, i.e. by emitting a 10-byte piece, followed by an empty 2/6-byte piece for the padding. Failing to cover all `DW_AT_byte_size' bytes of a value with non-empty pieces results in the value being printed as <optimized out> by GDB. Patch by: David Stenberg Reviewers: aprantl, JDevlieghere Reviewed By: aprantl, JDevlieghere Subscribers: llvm-commits Tags: #debug-info Differential Revision: https://reviews.llvm.org/D42807 llvm-svn: 324066	2018-02-02 10:34:13 +00:00
Jonas Paulsson	422dfbf7cc	[SelectionDAG] Consider endianness in scalarizeVectorStore(). When handling vectors with non byte-sized elements, reverse the order of the elements in the built integer if the target is Big-Endian. SystemZ tests updated. Review: Eli Friedman, Ulrich Weigand. https://reviews.llvm.org/D42786 llvm-svn: 324063	2018-02-02 08:48:02 +00:00
Jonas Paulsson	0e50b6ed80	[SystemZ] Update test case (NFC) test/CodeGen/SystemZ/vec-trunc-to-i1.ll was marked as a temporary FAIL when it was previously updated when it needed one more COPY. This was however wrong, since the loop body had been reduced significantly, and it was actually an improvement. Review: Ulrich Weigand. llvm-svn: 324060	2018-02-02 07:52:02 +00:00
Shiva Chen	53489ada12	[RISCV] Add ELFObjectFileBase::getRISCVFeatures so that llvm-objdump can get the RISCV target features. llvm-objdump can detect the C feature from the ELF::EF_RISCV_RVC e_flag, so we don't have to add -mattr=+c on the command line. Differential Revision: https://reviews.llvm.org/D42629 llvm-svn: 324058	2018-02-02 06:01:02 +00:00
Craig Topper	76c5ce5184	[X86] Legalize (v64i1 (bitcast (i64 X))) on 32-bit targets by extracting 32-bit halves from the i64, bitcasting each to v32i1, and concatenating. This prevents the scalarization that would otherwise occur. llvm-svn: 324057	2018-02-02 05:59:33 +00:00
Craig Topper	5570e03b21	[X86] Legalize (i64 (bitcast (v64i1 X))) on 32-bit targets by extracting to v32i1 and bitcasting to i32. This saves a trip through memory and seems to open up other combining opportunities. llvm-svn: 324056	2018-02-02 05:59:31 +00:00
Shiva Chen	b22c1d29bc	[RISCV] Fix c.addi and c.addi16sp immediate constraints which should be non-zero Differential Revision: https://reviews.llvm.org/D42782 llvm-svn: 324055	2018-02-02 02:43:23 +00:00
Shiva Chen	bbf4c5c25e	[RISCV] Define getSetCCResultType for setting vector setCC type. This avoids triggering the "No default SetCC type for vectors!" assertion. Differential Revision: https://reviews.llvm.org/D42675 llvm-svn: 324054	2018-02-02 02:43:18 +00:00
Amara Emerson	572f6cecf1	[AArch64][GlobalISel] Fix old use of % sigil in test. My rebase had missed the new $ sigil we're using. llvm-svn: 324051	2018-02-02 02:14:42 +00:00
Amara Emerson	58aea52bc4	[GlobalISel] Constrain the dest reg of IMPLICIT_DEF. This fixes a crash where the user is a COPY, which deliberately does not constrain its source operands, resulting in a vreg without a reg class escaping selection. Differential Revision: https://reviews.llvm.org/D42697 llvm-svn: 324047	2018-02-02 01:44:43 +00:00
Matthias Braun	ca0abaebfb	SplitKit: Fix liveness recomputation in some remat cases. Example situation: ``` BB0: %0 = ... use %0 ; ... condjump BB1 jmp BB2 BB1: %0 = ... ; rematerialized def from above (from earlier split step) jmp BB2 BB2: ; ... use %0 ``` %0 will have a live interval with 3 value numbers (for the BB0, BB1 and BB2 parts). Now SplitKit tries and succeeds in rematerializing the value number in BB2 (This only works because it is a secondary split so SplitKit can trace this back to a single original def). We need to recompute all live ranges affected by a value number that we rematerialize. The case that we missed before is that when the value that is rematerialized is at a join (Phi VNI) then we also have to recompute liveness for the predecessor VNIs. rdar://35699130 Differential Revision: https://reviews.llvm.org/D42667 llvm-svn: 324039	2018-02-02 00:08:19 +00:00
Vlad Tsyrklevich	b2c3ea7603	[cfi-verify] Add blame context printing, and improved print format. Summary: This update now allows users to specify `--blame-context` and `--blame-context-all` to print source file blame information for the source of the blame. Also updates the inline printing to correctly identify the top of the inlining stack for blame information. Patch by Mitch Phillips! Reviewers: vlad.tsyrklevich Subscribers: llvm-commits, kcc, pcc Differential Revision: https://reviews.llvm.org/D40111 llvm-svn: 324035	2018-02-01 23:45:18 +00:00
Simon Pilgrim	d1379c6df1	Fix check-prefixes typo and line endings. llvm-svn: 324024	2018-02-01 22:32:41 +00:00
Simon Pilgrim	808a0e1589	[X86][SSE] Add SSE41 to variable permute tests llvm-svn: 324017	2018-02-01 22:05:44 +00:00
Simon Pilgrim	26bf800625	[X86][XOP] Add XOP to variable permute tests llvm-svn: 324015	2018-02-01 21:57:37 +00:00
Sanjay Patel	3343fcef86	[InstCombine] allow multi-use values in canEvaluate* if all uses are in 1 inst This is the enhancement suggested in D42536 to fix a shortcoming in regular InstCombine's canEvaluate* functionality. When we have multiple uses of a value, but they're all in one instruction, we can allow that expression to be narrowed or widened for the same cost as a single-use value. AFAICT, this can only matter for multiply: sub/and/or/xor/select would be simplified away if the operands are the same value; add becomes shl; shifts with a variable shift amount aren't handled. Differential Revision: https://reviews.llvm.org/D42739 llvm-svn: 324014	2018-02-01 21:55:53 +00:00
Nemanja Ivanovic	77e34f15c9	[PowerPC] Tell VSX swap removal that scalar conversions are lane-sensitive This is a rather non-controversial change. We were missing these instructions from the list of instructions that are lane-sensitive. These two put the result into lane 0 (BE) or 3 (LE) regardless of the input. This patch fixes PR36068. llvm-svn: 324005	2018-02-01 21:09:04 +00:00
Craig Topper	a5944aade1	[DAGCombiner] When folding (insert_subvector undef, (bitcast (extract_subvector N1, Idx)), Idx) -> (bitcast N1) make sure that N1 has the same total size as the original output We were only checking the element count, but not the total width. This could cause illegal bitcasts to be created if for example the output was 512-bits, but N1 is 256 bits, and the extraction size was 128-bits. Fixes PR36199 Differential Revision: https://reviews.llvm.org/D42809 llvm-svn: 324002	2018-02-01 20:48:50 +00:00
Amara Emerson	cbc02c71a4	[GlobalISel] Fix assert failure when legalizing non-power-2 loads. Until we support extending loads properly we're going to fall back for these. We already handle stores in the same way, so this is just being consistent. llvm-svn: 324001	2018-02-01 20:47:03 +00:00
Brock Wyma	4536c1f569	[CodeView] Class record member counts should include base classes and ... Increment the field list member count for base classes and virtual base classes. Differential Revision: https://reviews.llvm.org/D41874 llvm-svn: 324000	2018-02-01 20:37:38 +00:00
Geoff Berry	94503c7bc3	[MachineCopyPropagation] Extend pass to do COPY source forwarding Summary: This change extends MachineCopyPropagation to do COPY source forwarding and adds an additional run of the pass to the default pass pipeline just after register allocation. This version of this patch uses the newly added MachineOperand::isRenamable bit to avoid forwarding registers in such a way as to violate constraints that aren't captured in the Machine IR (e.g. ABI or ISA constraints). This change is a continuation of the work started in D30751. Reviewers: qcolombet, javed.absar, MatzeB, jonpa, tstellar Subscribers: tpr, mgorny, mcrosier, nhaehnle, nemanjai, jyknight, hfinkel, arsenm, inouehrs, eraman, sdardis, guyblank, fedor.sergeev, aheejin, dschuff, jfb, myatsina, llvm-commits Differential Revision: https://reviews.llvm.org/D41835 llvm-svn: 323991	2018-02-01 18:54:01 +00:00
Changpeng Fang	29fcf883fb	AMDGPU/SI: Adjust the encoding family for D16 buffer instructions when the target has UnpackedD16VMem feature. Reviewers: Matt and Brian Differential Revision: https://reviews.llvm.org/D42548 llvm-svn: 323988	2018-02-01 18:41:33 +00:00
Simon Pilgrim	1a8cefc328	[X86][SSE] LowerBUILD_VECTORAsVariablePermute - add support for scaling index vectors This allows us to use PSHUFB for v8i16/v4i32 and VPERMD/PERMPS for v4i64/v4f64 variable shuffles. Differential Revision: https://reviews.llvm.org/D42487 llvm-svn: 323987	2018-02-01 18:10:30 +00:00
Sanjay Patel	702c19cc3e	[AArch64] add tests with sqrt estimate and ieee denorms; NFC As noted in D42323, we're not checking for denorms as we should. llvm-svn: 323985	2018-02-01 17:57:45 +00:00
Sanjay Patel	f42381fd7e	[AArch64] auto-generate complete checks; NFC llvm-svn: 323984	2018-02-01 17:44:50 +00:00
Craig Topper	7e910a9e85	[X86] Turn X86ISD::AND nodes that have no flag users back into ISD::AND just before isel to enable test instruction matching Summary: EmitTest sometimes creates X86ISD::AND specifically to hide the AND from DAG combine. But this prevents isel patterns that look for (cmp (and X, Y), 0) from being able to see it. So we end up with an AND and a TEST. The TEST gets removed by compare instruction optimization during the peephole pass. This patch attempts to fix this by converting X86ISD::AND with no flag users back into ISD::AND during the DAG preprocessing just before isel. In order to do this correctly I had to make the X86ISD::AND node created by EmitTest in this case really have a flag output. Which arguably it should have had anyway so that the number of operands would be consistent for the opcode in all cases. Then I had to modify the ReplaceAllUsesWith to understand that we might be looking at an instruction with 2 outputs. Though in this case there are no uses to replace since we just created the node, but that's what the code did before so I just made it keep working. Reviewers: spatel, RKSimon, niravd, deadalnix Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42764 llvm-svn: 323982	2018-02-01 17:08:39 +00:00
Sanjay Patel	657e5d8d41	[DAGCombiner] filter out denorm inputs when calculating sqrt estimate (PR34994) As shown in the example in PR34994: https://bugs.llvm.org/show_bug.cgi?id=34994 ...we can return a very wrong answer (inf instead of 0.0) for square root when using a reciprocal square root estimate instruction. Here, I've conditionalized the filtering out of denorms based on the function having "denormal-fp-math"="ieee" in its attributes. The other options for this attribute are 'preserve-sign' and 'positive-zero'. So we don't generate this extra code by default with just '-ffast-math' (because then there's no denormal attribute string at all), but it works if you specify '-ffast-math -fdenormal-fp-math=ieee' from clang. As noted in the review, there may be other problems in clang that affect the results depending on platform (Linux x86 at least), but this should allow creating the desired codegen. Differential Revision: https://reviews.llvm.org/D42323 llvm-svn: 323981	2018-02-01 16:57:18 +00:00
Nirav Dave	18f7f60e17	[SelectionDAG] Fix UpdateChains handling of TokenFactors Summary: In Instruction Selection UpdateChains replaces all matched Nodes' chain references including interior token factors and deletes them. This may allow nodes which depend on these interior nodes but are not part of the set of matched nodes to be left with a dangling dependence. Avoid this by doing the replacement for matched non-TokenFactor nodes. Fixes PR36164. Reviewers: jonpa, RKSimon, bogner Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D42754 llvm-svn: 323977	2018-02-01 16:11:59 +00:00
Simon Pilgrim	eb50b6d060	[X86][SSE] Add PR26491 horizontal add test llvm-svn: 323973	2018-02-01 15:30:02 +00:00
Simon Pilgrim	afc7c63bc2	[X86][AVX512DQ] Add DQ var permute 256 tests as requested on D42487 llvm-svn: 323970	2018-02-01 14:44:50 +00:00
Sjoerd Meijer	9d9a86535e	[ARM] FullFP16 LowerReturn Fix Commit r323512 introduced an optimisation in LowerReturn for half-precision return values. A missing check caused a crash when the return value is "undef" (i.e. a node that has no operands). Differential Revision: https://reviews.llvm.org/D42743 llvm-svn: 323968	2018-02-01 13:48:40 +00:00
David Green	184df0c35d	Revert commit rL323951 Looks like it's causing timeouts on at least ppc64le buildbots. llvm-svn: 323959	2018-02-01 13:05:25 +00:00
Aleksandar Beserminji	a330c208f2	[mips] Include EVA instructions in Std2MicroMips mapping tables This patch includes EVA instructions in the Std2MicroMips mapping tables, which is required for direct object emission. Differential Revision: https://reviews.llvm.org/D41771 llvm-svn: 323958	2018-02-01 12:53:26 +00:00
Yvan Roux	490e9e6761	[ARM] Add support for unpredictable MVN instructions. This fixes bugzilla 33011 https://bugs.llvm.org/show_bug.cgi?id=33011 Defines bits {19-16} as zero or unpredictable as specified by the ARM ARM in sections A8.8.116 and A8.8.117. It also fixes the usage of the PC register as destination register for the MVN register-shifted register version as specified in A8.8.117. Differential Revision: https://reviews.llvm.org/D41905 llvm-svn: 323954	2018-02-01 12:06:57 +00:00
David Green	e11f0545db	[InstCombine] Allow common type conversions to i8/i16/i32 This, in instcombine, allows conversions to i8/i16/i32 (very common cases) even if the resulting type is not legal according to the data layout. This can often open up extra combine opportunities. Differential Revision: https://reviews.llvm.org/D42424 llvm-svn: 323951	2018-02-01 11:06:18 +00:00
Mikael Holmen	6d06976e74	[LSR] Don't force bases of foldable formulae to the final type. Summary: Before emitting code for scaled registers, we prevent SCEVExpander from hoisting any scaled addressing mode by emitting all the bases first. However, these bases are being forced to the final type, resulting in some odd code. For example, if the type of the base is an integer and the final type is a pointer, we will emit an inttoptr for the base, a ptrtoint for the scale, and then a 'reverse' GEP where the GEP pointer is actually the base integer and the index is the pointer. It's more intuitive to use the pointer as a pointer and the integer as index. Patch by: Bevin Hansson Reviewers: atrick, qcolombet, sanjoy Reviewed By: qcolombet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42103 llvm-svn: 323946	2018-02-01 06:38:34 +00:00
Rafael Espindola	45b12f1835	[MC] Fix assembler infinite loop on EH table using LEB padding. Fix the infinite loop reported in PR35809. It can occur with GCC-style EH table assembly, where the compiler relies on the assembler to calculate the offsets in the EH table. Also see https://sourceware.org/bugzilla/show_bug.cgi?id=4029 for the equivalent issue in the GNU assembler. Patch by Ryan Prichard! llvm-svn: 323934	2018-02-01 00:25:19 +00:00
Matt Arsenault	df0f25070c	DAG: Fix not truncating when promoting bswap/bitreverse These need to convert back to the original type, like any other promotion. llvm-svn: 323932	2018-01-31 23:54:16 +00:00
Evgeniy Stepanov	7746899f48	Revert "[ARM] Lower lower saturate to 0 and lower saturate to -1 using bit-operations" Miscompiles code. Testcase pending. This reverts commit r323869. llvm-svn: 323929	2018-01-31 22:55:19 +00:00
Amjad Aboud	b86b771c02	[AggressiveInstCombine] Fixed TruncCombine class to handle TruncInst leaf node correctly. This covers the case where TruncInst leaf node is a constant expression. See PR36121 for more details. Differential Revision: https://reviews.llvm.org/D42622 llvm-svn: 323926	2018-01-31 22:39:05 +00:00
Puyan Lotfi	43e94b15ea	Followup on Proposal to move MIR physical register namespace to '$' sigil. Discussed here: http://lists.llvm.org/pipermail/llvm-dev/2018-January/120320.html In preparation for adding support for named vregs we are changing the sigil for physical registers in MIR to '$' from '%'. This will prevent name clashes of named physical registers with named vregs. llvm-svn: 323922	2018-01-31 22:04:26 +00:00
Chandler Carruth	0dcee4fe7a	[x86] Make the retpoline thunk insertion a machine function pass. Summary: This removes the need for a machine module pass using some deeply questionable hacks. This should address PR36123 which is a case where in full LTO the memory usage of a machine module pass actually ended up being significant. We should revert this on trunk as soon as we understand and fix the memory usage issue, but we should include this in any backports of retpolines themselves. Reviewers: echristo, MatzeB Subscribers: sanjoy, mcrosier, mehdi_amini, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D42726 llvm-svn: 323915	2018-01-31 20:56:37 +00:00
Krzysztof Parzyszek	1108ee2496	[Hexagon] Implement HVX codegen for vector shifts llvm-svn: 323914	2018-01-31 20:49:24 +00:00
Marek Olsak	8f2df9d26c	[SeparateConstOffsetFromGEP] Fix up addrspace in the AMDGPU test llvm-svn: 323913	2018-01-31 20:49:19 +00:00
Krzysztof Parzyszek	9eb085e6cf	[Hexagon] Handle ANY_EXTEND_VECTOR_INREG in lowering llvm-svn: 323912	2018-01-31 20:48:11 +00:00
Krzysztof Parzyszek	b843f75179	[Hexagon] Handle SETCC on vector pairs in lowering llvm-svn: 323911	2018-01-31 20:46:55 +00:00
Marek Olsak	d4bb329d0e	AMDGPU: Fold inline offset for loads properly in moveToVALU on GFX9 Summary: This enables load merging into x2, x4, which is driven by inline offsets. 6500 shaders are affected: Code Size in affected shaders: -15.14 % Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D42078 llvm-svn: 323909	2018-01-31 20:18:11 +00:00
Marek Olsak	13e4741275	AMDGPU: Add intrinsics llvm.amdgcn.cvt.{pknorm.i16, pknorm.u16, pk.i16, pk.u16} Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D41663 llvm-svn: 323908	2018-01-31 20:18:04 +00:00
Marek Olsak	8e7d149a31	[SeparateConstOffsetFromGEP] Preserve metadata when splitting GEPs Summary: !amdgpu.uniform needs to be preserved for AMDGPU, otherwise bad things happen. Reviewers: arsenm, nhaehnle, jingyue, broune, majnemer, bjarke.roune, dblaikie Subscribers: wdng, tpr, llvm-commits Differential Revision: https://reviews.llvm.org/D42744 llvm-svn: 323907	2018-01-31 20:17:52 +00:00
Geoff Berry	82203c4149	[MachineOutliner] Freeze registers in new functions Summary: Call MRI.freezeReservedRegs() on functions created during outlining so that calls to isReserved() by the verifier called after this pass won't assert. Reviewers: MatzeB, qcolombet, paquette Subscribers: mcrosier, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D42749 llvm-svn: 323905	2018-01-31 20:15:16 +00:00
Sam Clegg	f9edbe95db	[WebAssembly] MC: Resolve aliases when creating provisional table entries This change is useful for the upcoming addition of the symbol table (D41954) since in that world aliases for a given function all share the same function index. This change does not affect lld because it essentially ignores the wasm "table". The table exists only so that the wasm objects will validate and disassemble meaningfully. Patch by Nicholas Wilson! Differential Revision: https://reviews.llvm.org/D42095 llvm-svn: 323900	2018-01-31 19:28:47 +00:00
Amaury Sechet	f9a9e9a251	[X86] Generate testl instruction through truncates. Summary: This was introduced in D42646 but ended up being reverted because the original implementation was buggy. Depends on D42646 Reviewers: craig.topper, niravd, spatel, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42741 llvm-svn: 323899	2018-01-31 19:20:06 +00:00
Chih-Hung Hsieh	60d1e79ffb	[Analysis] Disable calls to *_finite and other glibc-only functions on Android. Since r322087, glibc's finite lib calls are generated when possible. However, they are not supported on Android. This change also disables other functions not available on Android. Differential Revision: http://reviews.llvm.org/D42668 llvm-svn: 323898	2018-01-31 19:12:50 +00:00
Max Moroz	790baeed37	[llvm-cov] Improvements for summary report generated in HTML format. Summary: This commit adds the following changes: 1) coverage numbers are aligned to the left and padded with spaces in order to provide better readability for percentage values, e.g.: ``` file1 | 89.13% (123 / 2323) | 100.00% (55 / 55) | 9.33% (14545 / 234234) file_asda | 1.78% ( 23 / 4323) | 32.31% (555 / 6555) | 67.89% (1545 / 2234) fileXXX | 100.00% (12323 / 12323) | 100.00% (555 / 555) | 100.00% (12345 / 12345) ``` 2) added "hover" attribute to CSS for highlighting table row under mouse cursor see screenshot attached to the phabricator review page {F5764813} 3) table title row and "totals" row now use bold text Reviewers: vsk, morehouse Reviewed By: vsk Subscribers: kcc, llvm-commits Differential Revision: https://reviews.llvm.org/D42093 llvm-svn: 323892	2018-01-31 17:37:21 +00:00
Daniel Neilson	be58a220e9	[CodeGenPrepare] Improve source and dest alignments of memory intrinsics independently Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the CodeGenPrepare pass to be more aggressive in improving the source and destination alignments of memcpy/memmove/memset by exploiting our new ability to record independent alignments for each argument. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment() and [get|set]SourceAlignment() instead. ( rL323886 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html llvm-svn: 323891	2018-01-31 17:24:53 +00:00
Krzysztof Parzyszek	82a83391d3	[Hexagon] Handle BUILD_VECTOR from undef values in buildHvxVectorReg llvm-svn: 323889	2018-01-31 16:52:15 +00:00
Amaury Sechet	f89f188ddb	[X86] Avoid using high register trick for test instruction Summary: It seems its main effect is to create additional copies when values are in registers that do not support this trick, which increases register pressure and makes the code bigger. Reviewers: craig.topper, niravd, spatel, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42646 llvm-svn: 323888	2018-01-31 16:48:54 +00:00
Krzysztof Parzyszek	8cc636c592	[Hexagon] Only process bitcasts of vsplats when selecting const vectors Selecting of constant HVX vectors involves some "manual processing", which mishandled an unrelated BITCAST operation causing a selection error. llvm-svn: 323887	2018-01-31 16:48:20 +00:00
Daniel Neilson	147810d28a	[Lint] Upgrade uses of MemoryIntrinsic::getAlignment() to new API. (NFCI) Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the Lint analysis to cease using the old getAlignment() API of MemoryIntrinsic in favour of getting source & dest specific alignments through the new API. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment() and [get|set]SourceAlignment() instead. Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html llvm-svn: 323886	2018-01-31 16:42:15 +00:00
Petar Jovanovic	540f4cd10a	[DWARF] Allow duplication of tails with CFI instructions This commit came as a result of the revert of patch r317579 (originally committed as r317100). The patch made CFI instructions duplicable, because their existence in the epilogue block was affecting the Tail duplication pass. However, duplicating blocks with CFI instructions was an issue for compact unwind info on Darwin, which is why the patch was reverted. This patch allows duplicating tails with CFI instructions, though they are not duplicable, by copying them 'manually'. Patch by Djordje Kovacevic. Differential Revision: https://reviews.llvm.org/D40979 llvm-svn: 323883	2018-01-31 15:57:57 +00:00
Sanjay Patel	fd58ade81c	[InstCombine] move related tests into the same file; NFC llvm-svn: 323882	2018-01-31 15:47:59 +00:00
Sanjay Patel	8c74a9a155	[InstCombine] add tests to show limit of canEvaluate* ; NFC llvm-svn: 323881	2018-01-31 15:28:39 +00:00
Nirav Dave	c3a1e16db1	[DAG] Prevent NodeId pruning of TokenFactors in Instruction Selection. Summary: Instruction Selection preserves relative orders of all nodes save TokenFactors which we treat specially. As a result Node Ids for TokenFactors may violate the topological ordering and should not be considered as valid pruning candidates in predecessor search. Fixes PR35316. Reviewers: RKSimon, hfinkel Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D42701 llvm-svn: 323880	2018-01-31 15:23:17 +00:00
Marina Yatsina	3f34f33148	Fix build error in r323870 Change-Id: I15a8b27764a4d817cfbe48836bf09dc6520934b7 llvm-svn: 323874	2018-01-31 14:18:37 +00:00
Florian Hahn	c68428b5dc	[MachineCombiner] Add check for optimal pattern order. In D41587, @mssimpso discovered that the order of some patterns for AArch64 was sub-optimal. I thought a bit about how we could avoid that case in the future. I do not think there is a need for evaluating all patterns for now. But this patch adds an extra (expensive) check, that evaluates the latencies of all patterns, and ensures that the latency saved decreases for subsequent patterns. This catches the sub-optimal order fixed in D41587, but I am not entirely happy with the check, as it only applies to sub-optimal patterns seen while building with EXPENSIVE_CHECKS on. It did not discover any other sub-optimal pattern ordering. Reviewers: Gerolf, spatel, mssimpso Reviewed By: Gerolf, mssimpso Differential Revision: https://reviews.llvm.org/D41766 llvm-svn: 323873	2018-01-31 13:54:30 +00:00
Marina Yatsina	cd5bc4a2cd	Take into account the cost of local intervals when selecting split candidate. When selecting a split candidate for region splitting, the register allocator tries to predict which candidate will have the cheapest spill cost. Global splitting may cause the creation of local intervals, and they might spill. This patch makes RA take into account the spill cost of local split intervals in use blocks (we already take into account the spill cost in through blocks). A flag ("-condsider-local-interval-cost") controls whether we do this advanced cost calculation (it's on by default for the X86 target, off for the rest). Differential Revision: https://reviews.llvm.org/D41585 Change-Id: Icccb8ad2dbf13124f5d97a18c67d95aa6be0d14d llvm-svn: 323870	2018-01-31 13:31:08 +00:00
Pablo Barrio	2e442a7831	[ARM] Lower lower saturate to 0 and lower saturate to -1 using bit-operations Summary: Expressions of the form x < 0 ? 0 : x; and x < -1 ? -1 : x can be lowered using bit-operations instead of branching or conditional moves. In thumb-mode this results in a two-instruction sequence, a shift followed by a bic or or while in ARM/thumb2 mode that has flexible second operand the shift can be folded into a single bic/or instruction. In most cases this results in smaller code and possibly fewer branches, and in no case larger than before. Patch by Marten Svanfeldt. Reviewers: fhahn, pbarrio Reviewed By: pbarrio Subscribers: efriedma, rogfer01, aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D42574 llvm-svn: 323869	2018-01-31 13:20:10 +00:00
Amaury Sechet	99dbec52be	Add a regression test for problems caused by D42646 . NFC llvm-svn: 323868	2018-01-31 13:02:01 +00:00
Jonas Paulsson	cc5fe73669	[SystemZ] Check the bitwidth before calling isInt/isUInt. Since these methods will assert if the integer does not fit into 64 bits, it is necessary to do this check before calling them in supportedAddressingMode(). Review: Ulrich Weigand. llvm-svn: 323866	2018-01-31 12:41:25 +00:00
Amjad Aboud	d895bff5f2	[AggressiveInstCombine] Make TruncCombine class ignore unreachable basic blocks. Because dead code may contain non-standard IR that causes infinite looping or crashes in underlying analysis. See PR36134 for more details. Differential Revision: https://reviews.llvm.org/D42683 llvm-svn: 323862	2018-01-31 10:41:31 +00:00
Sjoerd Meijer	98d5359ea2	[ARM] Armv8.2-A FP16 code generation (part 2/3) Half-precision arguments and return values are passed as if they were an int or float for ARM. This results in truncates and bitcasts to/from i16 and f16 values, which are legalized very early to stack stores/loads. When FullFP16 is enabled, we want to avoid codegen for these bitcasts as it is unnecessary and inefficient. Differential Revision: https://reviews.llvm.org/D42580 llvm-svn: 323861	2018-01-31 10:18:29 +00:00
Jonas Paulsson	e6a8329e9f	[PowerPC] Return true in enableMultipleCopyHints(). Enable multiple COPY hints to eliminate more COPYs during register allocation. Note that this is something all targets should do, see https://reviews.llvm.org/D38128. Review: Nemanja Ivanovic llvm-svn: 323858	2018-01-31 09:26:51 +00:00
Roger Ferrer Ibanez	aea4208720	[ARM] Allow the scheduler to clone a node with glue to avoid a copy CPSR ↔ GPR. In Thumb 1, with the new ADDCARRY / SUBCARRY the scheduler may need to do copies CPSR ↔ GPR but not all Thumb1 targets implement them. The scheduler can attempt, before attempting a copy, to clone the instructions but it does not currently do that for nodes with input glue. In this patch we introduce a target-hook to let the hook decide if a glued machinenode is still eligible for copying. In this case these are ARM::tADCS and ARM::tSBCS . As a follow-up of this change we should actually implement the copies for the Thumb1 targets that do implement them and restrict the hook to the targets that can't really do such a copy, as these clones are not ideal. This change fixes PR35836. Differential Revision: https://reviews.llvm.org/D42051 llvm-svn: 323857	2018-01-31 09:23:43 +00:00
Justin Bogner	5106d4d21c	Mark two tests REQUIRES: x86-registered-backend These were introduced in r323783 and use an X86 triple. I'll follow up on the list to check if it would make more sense to remove the triple and mark them REQUIRES: default_triple instead. llvm-svn: 323847	2018-01-31 07:32:03 +00:00
Peter Collingbourne	7873669be5	LTO: Drop comdats when converting definitions to declarations. Differential Revision: https://reviews.llvm.org/D42715 llvm-svn: 323844	2018-01-31 02:51:03 +00:00
Eli Friedman	804d7ab811	Revert r323559 due to EXPENSIVE_CHECKS regression. I have a fix for the issue (https://reviews.llvm.org/D42655) but it's taking a while to get reviewed, so reverting in the meantime. llvm-svn: 323841	2018-01-31 00:40:42 +00:00
Craig Topper	f98baa7065	[X86] Add more madd reduction tests with wider vectors. We had no test case exercising 512-bit vpmaddwd usage. llvm-svn: 323840	2018-01-31 00:30:32 +00:00
Kevin Enderby	b95a050b98	llvm-nm should show a symbol type of T for symbols in the (__TEXT_EXEC,__text) section. When the Apple link editor builds a kext bundle file type and the value of the -miphoneos-version-min argument is significantly current (like 11.0) then the (__TEXT,__text) section is changed to the (__TEXT_EXEC,__text) section. So it would be nice for llvm-nm to show symbols in that section with a type of T instead of the generic type of S for some section other than text, data, etc. rdar://36262205 llvm-svn: 323836	2018-01-31 00:00:41 +00:00
Krzysztof Parzyszek	119856430e	[RDF] Clear the renamable flag when copy propagating reserved registers llvm-svn: 323831	2018-01-30 23:19:44 +00:00
Yaxun Liu	c00d81e697	LLParser: add an argument for overriding data layout and do not check alloca addr space Sometimes users do not specify data layout in LLVM assembly and let llc set the data layout by target triple after loading the LLVM assembly. Currently the parser checks alloca address space no matter whether the LLVM assembly contains data layout definition, which causes false alarm since the default data layout does not contain the correct alloca address space. The parser also calls the verifier to check debug info and update invalid debug info. Currently there is no way to let the verifier check debug info only. If the verifier finds non-debug-info issues the parser will fail. For llc, the fix is to remove the check of alloca addr space in the parser and disable updating debug info, and defer the updating of debug info and verification to be after setting data layout of the IR by target. For other llvm tools, since they do not override data layout by target but instead can override data layout by a command line option, an argument for overriding data layout is added to the parser. In cases where data layout overriding is necessary for the parser, the data layout can be provided by command line. Differential Revision: https://reviews.llvm.org/D41832 llvm-svn: 323826	2018-01-30 22:32:39 +00:00
Robert Widmann	490a5808cd	[LLVM-C] Add Accessors For A Module's Source File Name Summary: Also unblocks some cleanup in the echo-test. Reviewers: whitequark, deadalnix Reviewed By: whitequark Subscribers: harlanhaskins, llvm-commits Differential Revision: https://reviews.llvm.org/D42618 llvm-svn: 323819	2018-01-30 21:34:29 +00:00
Vitaly Buka	59baf73a4d	[ThinLTO/gold] Write empty imports even for modules with symbols Summary: ThinLTO may skip object for other reasons, e.g. if there is no summary. Reviewers: pcc, eugenis Subscribers: mehdi_amini, inglorion, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D42514 llvm-svn: 323818	2018-01-30 21:19:26 +00:00
Evandro Menezes	b7d1729787	[AArch64] Expand testing of zero cycle zeroing Make sure that r321824 doesn't change zeroing. Differential revision: https://reviews.llvm.org/D42089 llvm-svn: 323816	2018-01-30 21:14:11 +00:00
Alexey Bataev	1c8f53f47d	[SLP] Add extra test for extractelement shuffle, NFC. llvm-svn: 323815	2018-01-30 21:06:06 +00:00
Teresa Johnson	df763188c9	Teach ValueMapper to use ODR uniqued types when available Summary: This is exposed during ThinLTO compilation, when we import an alias by creating a clone of the aliasee. Without this fix the debug type is unnecessarily cloned and we get a duplicate, undoing the uniquing. Fixes PR36089. Reviewers: mehdi_amini, pcc Subscribers: eraman, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D41669 llvm-svn: 323813	2018-01-30 20:16:32 +00:00
Jonas Devlieghere	cca341bb4e	[dsymutil] Enable -minimize feature. Passing -minimize to dsymutil prevents the emission of .debug_inlines, .debug_pubnames, and .debug_pubtypes in favor of the Apple accelerator tables. The actual check in the DWARF linker was added in r323655. This patch simply enables it. Differential revision: https://reviews.llvm.org/D42688 llvm-svn: 323812	2018-01-30 19:54:16 +00:00
Martin Storsjo	cc981d285d	[GlobalISel] Bail out on calls to dllimported functions Differential Revision: https://reviews.llvm.org/D42568 llvm-svn: 323811	2018-01-30 19:50:58 +00:00
Martin Storsjo	708498a164	[AArch64] Properly handle dllimport of variables when using fast-isel Differential Revision: https://reviews.llvm.org/D42567 llvm-svn: 323810	2018-01-30 19:50:51 +00:00
Artem Belevich	7d8f6fa86c	[TableGen] Make sure !if is evaluated throughout class inheritance. Without the patch !if() is only evaluated if it's used directly. If it's passed through more than one level of class inheritance, we end up with a reference to an anonymous record with unresolved references to the original arguments !if may have used. The root cause of the problem is that TernOpInit::isComplete() was always returning false and that prevented use of the folded value of !if() as an initializer for the record at the next level of inheritance. Differential Revision: https://reviews.llvm.org/D42695 llvm-svn: 323807	2018-01-30 19:29:21 +00:00
Sanjay Patel	ffb37a29d1	[LoopStrengthReduce] add test to show potential macro-fusion-based diff (PR35681); NFC This is the baseline output for the test proposed with D42607. llvm-svn: 323806	2018-01-30 19:17:38 +00:00
Wolfgang Pieb	d2d8e6876a	[DWARF] Recommitting a test that was removed with r323564. Restricted to x86 linux target. llvm-svn: 323804	2018-01-30 18:41:31 +00:00
Krzysztof Parzyszek	39a9842f3c	[Hexagon] Handle non-aligned offsets in globals in extender optimization Instructions like memd(r0+##global+1) are legal as long as the entire address is properly aligned. Assuming that "global" is aligned at an 8-byte boundary, the expression "global+1" appears to be misaligned. Handle such cases in HexagonConstExtenders, and make sure that any non- extended offsets generated are still aligned accordingly. llvm-svn: 323799	2018-01-30 18:12:37 +00:00
Krzysztof Parzyszek	96a284114e	Revert: [Hexagon] Make sure that offset on globals matches alignment requirements This reverts r323562, since it wasn't actually necessary. Constant- extended offsets do not need to be aligned, as long as the effective address is aligned. Keep the testcase, with a modification which checks that such offsets are not unnecessarily avoided. llvm-svn: 323798	2018-01-30 18:10:27 +00:00
Simon Pilgrim	073f089c6e	[X86][XOP] Update isVectorShiftByScalarCheap with cases covered by XOP Similar to D42437, XOP supports variable shift for v16i8/v8i16/v4i32/v2i64 types. Differential Revision: https://reviews.llvm.org/D42526 llvm-svn: 323797	2018-01-30 18:10:21 +00:00
Geoff Berry	1d53101387	[AMDGPU] isRenamable fixes to support copy forwarding Mark more opcodes as hasExtraSrcRegAllocReq so that their operands will be marked as not renamable, to avoid copy forwarding violating the constraint that only one operand may use the constant bus. These changes fix a few mis-compiles when copy forwarding is enabled in MachineCopyPropagation by D41835 (and were reviewed as part of that change). llvm-svn: 323794	2018-01-30 17:37:39 +00:00
Mark Searles	94ae3b2f9b	[AMDGPU] Revert "[AMDGPU] Add options for waitcnt pass debugging; add instr count in debug output." Patch caused a buildbot failure; arg; http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/17373/steps/build_Lld/logs/stdio : /Users/buildslave/as-bldslv9/lld-x86_64-darwin13/llvm.src/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1563:18: error: unused variable 'InstCnt' [-Werror,-Wunused-variable] static int32_t InstCnt = 0; " This reverts commit 4f4a7d61e306b67044d9f16bc2016fee806bc2cc. llvm-svn: 323791	2018-01-30 17:17:06 +00:00
Mark Searles	d6d5a2571f	[AMDGPU] Add options for waitcnt pass debugging; add instr count in debug output. -amdgpu-waitcnt-forcezero={1|0} Force all waitcnt instrs to be emitted as s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) -amdgpu-waitcnt-forceexp=<n> Force emit a s_waitcnt expcnt(0) before the first <n> instrs -amdgpu-waitcnt-forcelgkm=<n> Force emit a s_waitcnt lgkmcnt(0) before the first <n> instrs -amdgpu-waitcnt-forcevm=<n> Force emit a s_waitcnt vmcnt(0) before the first <n> instrs This patch was pushed ( abb190fd51cd2f9a9eef08c024e109f7f7e909fc ), which caused a buildbot failure, reverted ( 6227480d74da507cf8e1b4bcaffbdb9fb875b4b8 ), and then updated to fix buildbot failures (this patch). Differential Revision: https://reviews.llvm.org/D40091 llvm-svn: 323788	2018-01-30 16:49:38 +00:00
Changpeng Fang	0905870f93	AMDGPU/SI: Add decoding in the GFX80_UNPACKED decoding namespace. Reviewer: Dmitry (dp). Differential Revision: https://reviews.llvm.org/D42596 llvm-svn: 323785	2018-01-30 16:42:40 +00:00
Petar Jovanovic	9208e8fbf6	[DeadArgumentElimination] Preserve llvm.dbg.values's first argument When removing a return value, the Dead Argument Elimination pass clobbers the first argument of llvm.dbg.value for live arguments of that function by replacing it with nullptr. In the next pass it will be deleted, so the debug location of those arguments is lost. This change fixes it. Patch by Djordje Todorovic. Differential Revision: https://reviews.llvm.org/D42541 llvm-svn: 323784	2018-01-30 16:42:04 +00:00
Saleem Abdulrasool	b36fbbc3ec	CodeGen: support an extension to pass linker options on ELF Introduce an extension to support passing linker options to the linker. These would be ignored by older linkers, but newer linkers which support this feature would be able to process the linker. Emit a special discarded section `.linker-option`. The content of this section is a pair of strings (key, value). The key is a type identifier for the parameter. This allows for an argument free parameter that will be processed by the linker with the value being the parameter. As an example, `lib` identifies a library to be linked against, traditionally the `-l` argument for Unix-based linkers with the parameter being the library name. Thanks to James Henderson, Cary Coutant, Rafael Espindola, Sean Silva for the valuable discussion on the design of this feature. llvm-svn: 323783	2018-01-30 16:29:29 +00:00
Evandro Menezes	f1d01645a7	[AArch64] Add new target feature to fuse address generation with load or store This feature enables the fusion of the address generation and a corresponding load or store together. Differential revision: https://reviews.llvm.org/D42393 llvm-svn: 323782	2018-01-30 16:28:01 +00:00
Simon Dardis	daaeaba665	[mips] Fix incorrect sign extension for fpowi libcall PR36061 showed that during the expansion of ISD::FPOWI, that there was an incorrect zero extension of the integer argument which for MIPS64 would then give incorrect results. Address this with the existing mechanism for correcting sign extensions. This resolves PR36061. Thanks to James Cowgill for reporting the issue! Reviewers: atanasyan, hfinkel Differential Revision: https://reviews.llvm.org/D42537 llvm-svn: 323781	2018-01-30 16:24:10 +00:00
Zaara Syeda	1f59ae311b	Re-commit : [PowerPC] Add handling for ColdCC calling convention and a pass to mark candidates with coldcc attribute. This recommits r322721 reverted due to sanitizer memory leak build bot failures. Original commit message: This patch adds support for the coldcc calling convention for Power. This changes the set of non-volatile registers. It includes a pass to stress test the implementation by marking all static directly called functions with the coldcc attribute through the option -enable-coldcc-stress-test. It also includes an option, -ppc-enable-coldcc, to add the coldcc attribute to functions which are cold at all call sites based on BlockFrequencyInfo when the containing function does not call any non cold functions. Differential Revision: https://reviews.llvm.org/D38413 llvm-svn: 323778	2018-01-30 16:17:22 +00:00
Simon Pilgrim	05f6014257	[X86][AVX512] Add VBMI target shuffle-trunc tests llvm-svn: 323776	2018-01-30 16:01:41 +00:00
Evandro Menezes	16d7d81e5d	[AArch64] Update test cases for Exynos M3 Update any test case relevant for Exynos M3. llvm-svn: 323775	2018-01-30 15:40:27 +00:00
Evandro Menezes	9f9daa1f14	[AArch64] Add pipeline model for Exynos M3 Add the scheduling and cost model for Exynos M3. Differential revision: https://reviews.llvm.org/D42387 llvm-svn: 323773	2018-01-30 15:40:16 +00:00
Daniel Neilson	594f443b06	[RS4GC] Handle call/invoke instructions as base defining values of vectors Summary: There's an asymmetry in the definitions of findBaseDefiningValueOfVector() and findBaseDefiningValue() of RS4GC. The latter handles call and invoke instructions, and the former does not. This appears to be a simple oversight. This patch remedies the oversight by adding the call and invoke cases to findBaseDefiningValueOfVector(). Reviewers: DaniilSuchkov, anna Reviewed By: anna Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42653 llvm-svn: 323764	2018-01-30 14:43:41 +00:00
Andrei Elovikov	ce256fd5f7	[X86FixupBWInsts] mir-simplify fixup-bw-inst.mir test. NFC. llvm-svn: 323762	2018-01-30 14:25:12 +00:00
Eric Liu	0b69b5ed85	Revert "[X86] Avoid using high register trick for test instruction" This reverts commit r323690. This causes crash in llc. See the original commit thread for details. llvm-svn: 323761	2018-01-30 14:18:33 +00:00
Simon Pilgrim	cbc2d1e111	[X86] Add test case for PR32690 llvm-svn: 323760	2018-01-30 14:15:51 +00:00
Sanjay Patel	1aef27f5cd	[DSE] make sure memory is not modified before partial store merging (PR36129) We missed a critical check in D30703. We must make sure that no intermediate store is sitting between the stores that we want to merge. This should fix: https://bugs.llvm.org/show_bug.cgi?id=36129 Differential Revision: https://reviews.llvm.org/D42663 llvm-svn: 323759	2018-01-30 13:53:59 +00:00
Amaury Sechet	25c9ee0fec	Change simple-register-allocation-read-undef.mir so that it doesn't fail if the file path contains 'dead' . NFC llvm-svn: 323748	2018-01-30 11:07:36 +00:00
Diana Picus	f72e865372	[ARM GlobalISel] Add inst selector tests for G_SITOFP and G_UITOFP These are handled by the TableGen'erated code. llvm-svn: 323732	2018-01-30 09:15:27 +00:00
Diana Picus	2a5b962030	[ARM GlobalISel] Map G_SITOFP and G_UITOFP Straightforward mapping (integer operand to GPR, floating point operand to FPR). llvm-svn: 323731	2018-01-30 09:15:23 +00:00
Diana Picus	517531e5a5	[ARM GlobalISel] Legalize G_SITOFP and G_UITOFP Legal if we have hardware support, libcall otherwise. Also add supporting code to the legalizer helper for libcalls. llvm-svn: 323730	2018-01-30 09:15:17 +00:00
Diana Picus	f5ad62d921	[ARM GlobalISel] Add inst selector tests for G_FPTOSI and G_FPTOUI The work is done by the TableGen'erated code. llvm-svn: 323728	2018-01-30 07:55:02 +00:00
Diana Picus	a2da03022c	[ARM GlobalISel] Map G_FPTOSI and G_FPTOUI Straightforward mapping (integer operand goes to GPR, floating point operand goes to FPR). llvm-svn: 323727	2018-01-30 07:54:58 +00:00
Diana Picus	4ed0ee7b5f	[ARM GlobalISel] Legalize G_FPTOSI and G_FPTOUI Legal if we have hardware support for floating point, libcalls otherwise. Also add the necessary support for libcalls in the legalizer helper. llvm-svn: 323726	2018-01-30 07:54:52 +00:00
Craig Topper	dbf0bc75e4	[X86] Auto-generate complete checks. NFC llvm-svn: 323724	2018-01-30 07:02:29 +00:00
Wolfgang Pieb	52dd7616c5	[DWARF] Corrected test committed in r323670 to use llc instead of llc_dwarf to avoid multiple triples. llvm-svn: 323721	2018-01-30 01:11:46 +00:00
Sanjay Patel	83f056604c	[InstSimplify] (X * Y) / Y --> X for relaxed floating-point ops This is the FP counterpart that was mentioned in PR35709: https://bugs.llvm.org/show_bug.cgi?id=35709 Differential Revision: https://reviews.llvm.org/D42385 llvm-svn: 323716	2018-01-30 00:18:37 +00:00
Dan Gohman	832092ca12	[SelectionDAG]: Ignore "returned" in the presence of an implicit sret. When a function return value can't be directly lowered, such as returning an i128 on WebAssembly, as indicated by the CanLowerReturn target hook, SelectionDAGBuilder can translate it to return the value through a hidden sret-like argument. If such a function has an argument with the "returned" attribute, the attribute can't be automatically lowered, because the function no longer has a normal return value. For now, just discard the "returned" attribute. This fixes PR36128. llvm-svn: 323715	2018-01-30 00:14:40 +00:00
Quentin Colombet	72f6d59841	[RAFast] Don't dereference MBB::end When RAFast sees liveins on a basic block, it uses that information to initialize the availability of the registers. The called method uses an instruction as one of its arguments and in the liveins case, RAFast was dereferencing MBB::begin which can be MBB::end for an empty basic block. Change the API of definePhysReg to use MachineBasicBlock::iterator instead of MachineInstr so that we don't dereference an invalid iterator while making the call. rdar://problem/36952401 llvm-svn: 323710	2018-01-29 23:42:37 +00:00
Craig Topper	571231a7fe	[X86] Use VMOVDQA64 for aligned vXi32 stores. I meant to do this with the unaligned stores in r322820, but looks like I missed it. llvm-svn: 323708	2018-01-29 23:27:23 +00:00
Marek Olsak	48057b554c	AMDGPU: Allow a SGPR for the conditional KILL operand Patch by: Bas Nieuwenhuizen Just use the _e64 variant if needed. This should be possible as per def : Pat < (int_amdgcn_kill (i1 (setcc f32:$src, InlineFPImm<f32>:$imm, cond:$cond))), (SI_KILL_F32_COND_IMM_PSEUDO $src, (bitcast_fpimm_to_i32 $imm), (cond_as_i32imm $cond)) > ; I don't think we can get an immediate for the other operand for which we need the second 32-bit word. https://reviews.llvm.org/D42302 llvm-svn: 323706	2018-01-29 23:19:10 +00:00
Sanjay Patel	d023a9b777	[DSE] add test for PR36129; NFC We can miscompile because we're not checking if the memory might be modified between the seemingly redundant store ops. llvm-svn: 323704	2018-01-29 22:50:08 +00:00
Craig Topper	a8f87a36f1	[X86] Add FeaturePOPCNTFalseDeps to skylake server CPU to match skylake client. llvm-svn: 323700	2018-01-29 21:56:48 +00:00
Simon Pilgrim	02bdac53e7	[X86] Emit 11-byte or 15-byte NOPs on recent AMD targets, else default to 10-byte NOPs (PR22965) We currently emit up to 15-byte NOPs on all targets (apart from Silvermont), which stalls performance on some targets with decoders that struggle with 2 or 3 more '66' prefixes. This patch flags recent AMD targets (btver1/znver1) to still emit 15-byte NOPs and bdver* targets to emit 11-byte NOPs. All other targets now emit 10-byte NOPs apart from SilverMont CPUs which still emit 7-byte NOPS. Differential Revision: https://reviews.llvm.org/D42616 llvm-svn: 323693	2018-01-29 21:24:31 +00:00
Daniel Sanders	08464524c3	[ARM][GISel] PR35965 Constrain RegClasses of nested instructions built from Dst Pattern Summary: Apparently, we missed on constraining register classes of VReg-operands of all the instructions built from a destination pattern but the root (top-level) one. The issue exposed itself while selecting G_FPTOSI for armv7: the corresponding pattern generates VTOSIZS wrapped into COPY_TO_REGCLASS, so top-level COPY_TO_REGCLASS gets properly constrained, while nested VTOSIZS (or rather its destination virtual register to be exact) does not. Fixing this by issuing GIR_ConstrainSelectedInstOperands for every nested GIR_BuildMI. https://bugs.llvm.org/show_bug.cgi?id=35965 rdar://problem/36886530 Patch by Roman Tereshin Reviewers: dsanders, qcolombet, rovka, bogner, aditya_nandakumar, volkan Reviewed By: dsanders, qcolombet, rovka Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D42565 llvm-svn: 323692	2018-01-29 21:09:12 +00:00
Paul Robinson	bf750c80e9	[DWARFv5] Re-enable dumping a line table with no CU. r323476 added support for DW_FORM_line_strp, and incorrectly made that depend on having a DWARFUnit available. We shouldn't be tracking .debug_line_str in DWARFUnit after all. After this patch, I can do an NFC follow up and undo a bunch of the "plumbing" part of r323476. Differential Revision: https://reviews.llvm.org/D42609 llvm-svn: 323691	2018-01-29 20:57:43 +00:00
Amaury Sechet	015184b79e	[X86] Avoid using high register trick for test instruction Summary: It seems its main effect is to create additional copies when values are in registers that do not support this trick, which increases register pressure and makes the code bigger. The main noteworthy regression I was able to observe was patterns of the type (setcc (trunc (and X, C)), 0) where C is such that it would benefit from the hi register trick. To prevent this, a new pattern is added to materialize such patterns using a 32-bit test. This has the added benefit of working with any constant that is materializable as a 32-bit immediate, not just the ones that can leverage the high register trick, as demonstrated by the test case in test-shrink.ll using the constant 2049 . Reviewers: craig.topper, niravd, spatel, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42646 llvm-svn: 323690	2018-01-29 20:54:33 +00:00
Amaury Sechet	4cbca08d71	[X86] Add test case to ensure testw is generated when optimizing for size. NFC llvm-svn: 323687	2018-01-29 20:22:46 +00:00
Jun Bum Lim	fc7d56d949	Revert "AArch64: Omit callframe setup/destroy when not necessary" This reverts commit r322917 due to multiple performance regressions in spec2006 and spec2017. XFAILed llvm/test/CodeGen/AArch64/big-callframe.ll which initially motivated this change. llvm-svn: 323683	2018-01-29 19:56:42 +00:00
Rafael Espindola	e899a0b824	Improve testcase. We now test that pic and static produce different results for bar. The function names were demangled. The attributes are written inline. llvm-svn: 323680	2018-01-29 19:37:27 +00:00
Geoff Berry	d37dc77b6e	[AMDGPU][X86][Mips] Make sure renamable bit not set for reserved regs Summary: Fix a few places that were modifying code after register allocation to set the renamable bit correctly to avoid failing the validation added in D42449. llvm-svn: 323675	2018-01-29 18:47:48 +00:00
Craig Topper	eb13ebdb99	[X86] Don't create SHRUNKBLEND when the condition is used by the true or false operand of the vselect. Fixes PR34592. Differential Revision: https://reviews.llvm.org/D42628 llvm-svn: 323672	2018-01-29 17:56:57 +00:00
Craig Topper	63db1c117a	[X86] Add test case for pr34592 llvm-svn: 323671	2018-01-29 17:56:55 +00:00
Wolfgang Pieb	9f23426cb0	[DWARF] Recommitting a test reverted in r323560. Moved to x86 directory with explicit triple. ELF support is required for type units. llvm-svn: 323670	2018-01-29 17:49:10 +00:00
Amaury Sechet	9827d8ed15	Add test case for truncated and promotion to test. NFC llvm-svn: 323663	2018-01-29 16:13:01 +00:00
Alexey Bataev	9c5c103283	[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle. Summary: If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry and the cost of this gather is counted as the cost of InsertElementInstrs for each gathered value. But we can consider these elements as ShuffleInstr with SK_PermuteSingle shuffle kind. Reviewers: spatel, RKSimon, mkuper, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38697 llvm-svn: 323662	2018-01-29 16:08:52 +00:00
Alexey Bataev	10f5c9e765	[SLP] Add a test with extract for PR32086, NFC. llvm-svn: 323661	2018-01-29 15:56:52 +00:00
Jonas Devlieghere	5ead3a2b07	[dsymutil] Generate Apple accelerator tables This patch adds support for generating accelerator tables in dsymutil. This feature was already present in our internal repository but not yet upstreamed because it requires changes to the Apple accelerator table implementation. Differential revision: https://reviews.llvm.org/D42501 llvm-svn: 323655	2018-01-29 14:52:50 +00:00
Dmitry Preobrazhensky	4f321aef74	[AMDGPU][MC] Corrected parsing of image opcode modifiers r128 and d16 See bugs 36092, 36093: https://bugs.llvm.org/show_bug.cgi?id=36092 https://bugs.llvm.org/show_bug.cgi?id=36093 Differential Revision: https://reviews.llvm.org/D42583 Reviewers: vpykhtin, artem.tamazov, arsenm llvm-svn: 323651	2018-01-29 14:20:42 +00:00
Mikael Holmen	a9e31537af	[DebugInfo] Fix fragment offset emission order for symbol locations Summary: When emitting the location for a global variable with fragmented debug expressions, make sure that the offset pieces, which represent optimized-out parts of the variable, are emitted before their succeeding fragments' expressions. Previously, if the succeeding fragment's location was a symbol, the offset piece was emitted after, rather than before, that symbol's expression. This effectively meant that the symbols were associated with the wrong parts of the variable. This fixes PR36085. Patch by: David Stenberg Reviewers: aprantl, probinson, dblaikie Reviewed By: aprantl Subscribers: JDevlieghere, llvm-commits Tags: #debug-info Differential Revision: https://reviews.llvm.org/D42527 llvm-svn: 323644	2018-01-29 12:37:30 +00:00
Jonas Devlieghere	865de57bde	[Sparc] Account for bias in stack readjustment Summary: This was broken long ago in D12208, which failed to account for the fact that 64-bit SPARC uses a stack bias of 2047, and it is the unbiased value which should be aligned, not the biased one. This was seen to be an issue with Rust. Patch by: jrtc27 (James Clarke) Reviewers: jyknight, venkatra Reviewed By: jyknight Subscribers: jacob_hansen, JDevlieghere, fhahn, fedor.sergeev, llvm-commits Differential Revision: https://reviews.llvm.org/D39425 llvm-svn: 323643	2018-01-29 12:10:32 +00:00
Pavel Labath	394e805668	Refactor dwarfdump -apple-names output Summary: This modifies the dwarfdump output to align it with the new .debug_names dump. It also renames two header fields to match similar fields in the dwarf5 header. A couple of tests needed to be updated to match new output. The changes were fairly straight-forward, although not really automatable. Reviewers: JDevlieghere, aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42415 llvm-svn: 323641	2018-01-29 11:33:17 +00:00
Pavel Labath	3c9a918c9e	[DebugInfo] Basic .debug_names dumping support Summary: This commit renames DWARFAcceleratorTable to AppleAcceleratorTable to free up the first name as an interface for the different accelerator tables. Then I add a DWARFDebugNames class for the dwarf5 table. Presently, the only common functionality of the two classes is the dump() method, because this is the only method that was necessary to implement dwarfdump -debug-names; and because the rest of the AppleAcceleratorTable interface does not directly transfer to the dwarf5 tables (the main reason for that is that the present interface assumes the tables are homogeneous, but the dwarf5 tables can have different keys associated with each entry). I expect to make the common interface richer as I add more functionality to the new class (and invent a way to represent it in generic way). In terms of sharing the implementation, I found the format of the two tables sufficiently different to frustrate any attempts to have common parsing or dumping code, so presently the implementations share just low level code for formatting dwarf constants. Reviewers: vleschuk, JDevlieghere, clayborg, aprantl, probinson, echristo, dblaikie Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42297 llvm-svn: 323638	2018-01-29 11:08:32 +00:00
Andrei Elovikov	c560a18c7f	[X86FixupBWInsts] Fix miscompilation if sibling sub-register is live. Summary: The issue was found during D40524. Reviewers: andrew.w.kaylor, craig.topper, MatzeB Reviewed By: andrew.w.kaylor Subscribers: aivchenk, llvm-commits Differential Revision: https://reviews.llvm.org/D42533 llvm-svn: 323635	2018-01-29 09:26:04 +00:00
Oliver Stannard	a9d2e004d2	[AArch64] Generate the CASP instruction for 128-bit cmpxchg The Large System Extension added an atomic compare-and-swap instruction that operates on a pair of 64-bit registers, which we can use to implement a 128-bit cmpxchg. Because i128 is not a legal type for AArch64 we have to do all of the instruction selection in C++, and the instruction requires even/odd register pairs, so we have to wrap it in REG_SEQUENCE and EXTRACT_SUBREG nodes. This is very similar to what we do for 64-bit cmpxchg in the ARM backend. Differential revision: https://reviews.llvm.org/D42104 llvm-svn: 323634	2018-01-29 09:18:37 +00:00
George Rimar	eaf5172ca6	[ThinLTO] - Stop internalizing and drop non-prevailing symbols. The implementation marks non-prevailing symbols as not live in the summary. They are then dropped in the backends. Fixes https://bugs.llvm.org/show_bug.cgi?id=35938 Differential revision: https://reviews.llvm.org/D42107 llvm-svn: 323633	2018-01-29 08:03:30 +00:00
Craig Topper	62b62356fa	[X86] Make foldLogicOfSetCCs work better for vectors pre legal types/operations Summary: There's a check in the code to only check getSetCCResultType after LegalOperations or if the type is MVT::i1. But the i1 check is only allowing scalar types through. I think it should check that the scalar type is MVT::i1 so that it will work for vectors. The changed test already does this combine with AVX512VL where getSetCCResultType returns vXi1. But with avx512f and no VLX getSetCCResultType returns a type matching the width of the input type. Reviewers: spatel, RKSimon Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42619 llvm-svn: 323631	2018-01-29 07:52:55 +00:00
Davide Italiano	8b797a0fd2	[CVP] Don't Replace incoming values from unreachable blocks with undef. This pretty much reverts r322006, except that we keep the test, because we work around the issue exposed in a different way (a recursion limit in value tracking). There's still probably some sequence that exposes this problem, and the proper way to fix that for somebody who has time is outlined in the code review. llvm-svn: 323630	2018-01-29 05:59:55 +00:00
Hiroshi Inoue	c8e9245816	[NFC] fix trivial typos in comments and documents "to to" -> "to" llvm-svn: 323628	2018-01-29 05:17:03 +00:00
Florian Hahn	1636651e35	[InlineCost] Mark functions accessing varargs as not viable. This prevents functions accessing varargs from being inlined if they have the alwaysinline attribute. Reviewers: efriedma, rnk, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D42556 llvm-svn: 323619	2018-01-28 19:11:49 +00:00
Craig Topper	3913a4dd56	[X86] Fix a crash that can occur in combineExtractVectorElt due to not checking the width of a ConstantSDNode before calling getConstantOperandVal. llvm-svn: 323614	2018-01-28 07:29:35 +00:00
Craig Topper	15d69739e2	[X86] Remove VPTESTM/VPTESTNM ISD opcodes. Use isel patterns matching cmpm eq/ne with immallzeros. llvm-svn: 323612	2018-01-28 00:56:30 +00:00
Craig Topper	5e4b45361f	[X86] Add patterns for using masked vptestnmd for 256-bit vectors without VLX. We can widen the mask and extract it back down. llvm-svn: 323610	2018-01-27 23:49:14 +00:00
Craig Topper	540daee124	[X86] Add test to demonstrate missed opportunity to merge kand into testnm when using 512-bit instruction due to lack of VLX. llvm-svn: 323609	2018-01-27 23:49:11 +00:00
Justin Bogner	6e36f8250c	Add triples or specify REQUIRES: default_triple to some tests These were all failing when building the X86 backend but specifying LLVM_DEFAULT_TARGET_TRIPLE=''. llvm-svn: 323608	2018-01-27 23:31:09 +00:00
Simon Pilgrim	442aefdd22	[X86][AVX512] Add avx512dq fp2int/int2fp tests (PR31630) llvm-svn: 323607	2018-01-27 22:08:27 +00:00
Craig Topper	247016a735	[X86] Use vptestm/vptestnm for comparisons with zero to avoid creating a zero vector. We can use the same input for both operands to get a free compare with zero. We already use this trick in a couple places where we explicitly create PTESTM with the same input twice. This generalizes it. I'm hoping to remove the ISD opcodes and move this to isel patterns like we do for scalar cmp/test. llvm-svn: 323605	2018-01-27 20:19:09 +00:00
Craig Topper	513d3fa674	[X86] Remove X86ISD::PCMPGTM/PCMPEQM and instead just use X86ISD::PCMPM and pattern match the immediate value during isel. Legalization is still biased to turn LT compares in to GT by swapping operands to avoid needing extra isel patterns to commute. I'm hoping to remove TESTM/TESTNM next and this should simplify that by making EQ/NE more similar. llvm-svn: 323604	2018-01-27 20:19:02 +00:00
Simon Pilgrim	9c4fbad1d2	Regenerate test. NFCI llvm-svn: 323603	2018-01-27 19:49:46 +00:00
Simon Pilgrim	fe3fac805a	[X86][SSE] Simplify demanded elements from BROADCAST shuffle source. If broadcasting from another shuffle, attempt to simplify it. We can probably generalize this a lot more (embedding in combineX86ShufflesRecursively), but BROADCAST is one of the more troublesome as it accepts inputs of different sizes to the result. llvm-svn: 323602	2018-01-27 19:48:13 +00:00
Amaury Sechet	c131a3e548	Regenerate test result for vastart-defs-eflags.ll. NFC. llvm-svn: 323596	2018-01-27 17:52:32 +00:00
Amaury Sechet	0510b0f3d0	Regenerate test result for testb-je-fusion.ll. NFC. llvm-svn: 323595	2018-01-27 17:19:16 +00:00
Amaury Sechet	fb10aff542	Regenerate test result for stateppint-vector.ll. NFC. llvm-svn: 323594	2018-01-27 17:16:26 +00:00
Amaury Sechet	c207243405	Regenerate brcond.ll test results. NFC llvm-svn: 323593	2018-01-27 16:57:15 +00:00
Amaury Sechet	a078c2e5cb	Regenerate test results for avx-brcond.ll. NFC llvm-svn: 323592	2018-01-27 16:44:00 +00:00
Simon Pilgrim	a01a52431f	[X86][SSE] Regenerate fp2int/int2fp tests Cleanup check prefixes and check full codegen llvm-svn: 323591	2018-01-27 16:39:12 +00:00
Amaury Sechet	1ae296da36	Regenerate test results for and-su.ll. NFC llvm-svn: 323588	2018-01-27 16:00:10 +00:00
Simon Pilgrim	516cee12cc	[X86][SSE] Add broadcast from v2i32 memory tests (PR34394) llvm-svn: 323587	2018-01-27 15:54:57 +00:00
Craig Topper	2c570eaa00	[TargetLowering] Teach TargetLowering::SimplifySetCC to simplify setcc of vXi1 vectors into logic ops. This transform was already being done for setcc of scalar i1. This extends it to vectors. llvm-svn: 323585	2018-01-27 09:10:58 +00:00
Craig Topper	c80f0ced84	[SelectionDAG] Make DAGTypeLegalizer::PromoteSetCCOperands handle SETEQ/SETNE correctly for vector types. The code was using getValueSizeInBits and combining with the result of a call to DAG.ComputeNumSignBits. But for vector types getValueSizeInBits returns the width of the full vector while ComputeNumSignBits is going to give a number no larger than the width of a single element. So we should be using getScalarValueSizeInBits to get the element width. llvm-svn: 323583	2018-01-27 08:41:03 +00:00
Amara Emerson	77a5c96560	[GlobalISel][Legalizer] Convert the FP constants to the right APFloat type for G_FCONSTANT. We weren't converting the immediate ConstantFP during legalization, which caused the wrong bit patterns to be emitted for half type FP constants. Fixes PR36106. llvm-svn: 323582	2018-01-27 07:07:20 +00:00
Alexey Bataev	f86be12182	Revert "[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle." This reverts commit r323530 to fix possible problems in users code. llvm-svn: 323581	2018-01-27 02:42:21 +00:00
Vedant Kumar	cff94627cf	[InstrProfiling] Don't exit early when an unused intrinsic is found This fixes a think-o in r323574. llvm-svn: 323576	2018-01-27 00:01:04 +00:00
Craig Topper	8a444ee67c	[X86] Use vpternlog to implement vector not under AVX512. Previously we had to materialize all 1s in a register using vpternlog or pcmpeq and then xor with that. By using vpternlog directly we can do it in one operation. This is implemented using isel patterns, but we should maybe consider creating a generalized vpternlog combiner. llvm-svn: 323572	2018-01-26 22:17:40 +00:00
Sanjay Patel	5bce08ddff	[x86] auto-generate complete checks; NFC llvm-svn: 323571	2018-01-26 22:06:07 +00:00
Vedant Kumar	e48597a50e	[InstCombine] Preserve debug values for eliminable casts A cast from A to B is eliminable if its result is casted to C, and if the pair of casts could just be expressed as a single cast. E.g here, %c1 is eliminable: %c1 = zext i16 %A to i32 %c2 = sext i32 %c1 to i64 InstCombine optimizes away eliminable casts. This patch teaches it to insert a dbg.value intrinsic pointing to the final result, so that local variables pointing to the eliminable result are preserved. Differential Revision: https://reviews.llvm.org/D42566 llvm-svn: 323570	2018-01-26 22:02:52 +00:00
Krzysztof Parzyszek	90ca4e8b0c	[Hexagon] Generate constant splats instead of loads from constant pool llvm-svn: 323568	2018-01-26 21:54:56 +00:00
Wolfgang Pieb	6806cf9eb5	[DWARF] Temporarily removing test to make buildbots happy while investigating. llvm-svn: 323564	2018-01-26 21:24:22 +00:00
Krzysztof Parzyszek	d4273abb69	[Hexagon] Make sure that offset on globals matches alignment requirements A correctly aligned address may happen to be separated into a variable part and a constant part, where the constant part does not match the alignment needed in a load/store that uses this address. Such a constant cannot be used as an immediate offset in an indexed instruction. When lowering a global address, make sure that if there is an offset folded into the global, the offset is valid for all uses in load/store instructions. llvm-svn: 323562	2018-01-26 21:20:04 +00:00
Krzysztof Parzyszek	95614acc24	[Hexagon] Replace multiple vector extracts with store-load combinations llvm-svn: 323561	2018-01-26 21:17:14 +00:00
Wolfgang Pieb	06c0eca3c0	[DWARF] Temporarily removing a test that caused an independent failure on the mingw target. Will recommit once that is addressed. llvm-svn: 323560	2018-01-26 20:47:24 +00:00
Eli Friedman	29108843ff	[LivePhysRegs] Preserve pristine regs in blocks with no successors. One common source of blocks with no successors is calls to noreturn functions; we want to preserve pristine registers in case they throw an exception. The whole pristine register thing is messy (we should really prefer to explicitly model registers), but this fills a hole in the model for now. Fixes https://bugs.llvm.org/show_bug.cgi?id=36073. Differential Revision: https://reviews.llvm.org/D42509 llvm-svn: 323559	2018-01-26 20:23:00 +00:00
Alexey Bataev	7ad4e31c3b	[SLP] Test for trunc vectorization, NFC. llvm-svn: 323556	2018-01-26 20:07:55 +00:00
Craig Topper	d4795b700d	[X86] Allow any_extend to be combined with setcc on VLX targets. For VLX targets getSetccResultType returns vXi1, which prevents the target independent DAG combine from doing this transform itself. llvm-svn: 323555	2018-01-26 20:02:52 +00:00
Simon Pilgrim	8e9becbd81	[X86][AVX512] Add combining support for X86ISD::VTRUNCS Similar to the existing support for X86ISD::VTRUNCUS. Differential Revision: https://reviews.llvm.org/D42544 llvm-svn: 323553	2018-01-26 20:01:12 +00:00
Krzysztof Parzyszek	1a1edbfb04	[Hexagon] Fix an incorrect assertion in HexagonConstExtenders llvm-svn: 323548	2018-01-26 19:20:50 +00:00
Wolfgang Pieb	456b555ffe	[DWARF] Generate DWARF v5 string offsets tables along with strx* index forms. Summary: This is the producer side for DWARF v5 string offsets tables. The reader/consumer side was committed with r321295. All compile and type units in a module share a contribution to the string offsets table. Indirect strings use the strx{1,2,3,4} index forms. Reviewers: dblaikie, aprantl, JDevliegehere Differential Revision: https://reviews.llvm.org/D42021 llvm-svn: 323546	2018-01-26 18:52:58 +00:00
Simon Pilgrim	1b14bdc0b8	[X86][AVX] LowerBUILD_VECTORAsVariablePermute - add support for VPERMILPV to v4i32/v4f32 Extension to D42431, adding support for v4i32/v4f32 as well as v2i64/v2f64 now that D42308 has landed llvm-svn: 323542	2018-01-26 17:19:59 +00:00
Simon Pilgrim	76ede609f6	[X86][SSE] Don't coalesce v4i32 extracts We currently coalesce v4i32 extracts from all 4 elements to 2 v2i64 extracts + shifts/sign-extends. This seems to have been added back in the days when we tended to spill vectors and reload scalars, or ended up with repeated shuffles moving everything down to 0'th index. I don't think either of these are likely these days as we have better EXTRACT_VECTOR_ELT and VECTOR_SHUFFLE handling, and the existing code tends to make it very difficult for various vector and load combines. Differential Revision: https://reviews.llvm.org/D42308 llvm-svn: 323541	2018-01-26 17:11:34 +00:00
Nirav Dave	9896238dc9	[DAG] Teach findBaseOffset to interpret indexes of indexed memory operations Indexed outputs are addition / subtractions and can be interpreted as such. llvm-svn: 323539	2018-01-26 16:51:27 +00:00
Dmitry Preobrazhensky	706828157f	[AMDGPU][MC] Added validation of image dst/data size (must match dmask and tfe) See bug 36000: https://bugs.llvm.org/show_bug.cgi?id=36000 Differential Revision: https://reviews.llvm.org/D42483 Reviewers: vpykhtin, artem.tamazov, arsenm llvm-svn: 323538	2018-01-26 16:42:51 +00:00
Alexander Richardson	1f9636f3ef	[MIPS] Don't crash on unsized extern types with -mgpopt Summary: This fixes an assertion when building the FreeBSD MIPS64 kernel. Reviewers: atanasyan, sdardis, emaste Reviewed By: sdardis Subscribers: krytarowski, llvm-commits Differential Revision: https://reviews.llvm.org/D42571 llvm-svn: 323536	2018-01-26 15:56:14 +00:00
Simon Pilgrim	f531cf8964	[DAGCombine] reduceBuildVecToShuffle - ensure EXTRACT_VECTOR_ELT index is in range From OSS Fuzz Test Case #5688 llvm-svn: 323535	2018-01-26 15:50:20 +00:00
Dmitry Preobrazhensky	0b4eb1ead1	[AMDGPU][MC] Added support of 64-bit image atomics See bug 35998: https://bugs.llvm.org/show_bug.cgi?id=35998 Differential Revision: https://reviews.llvm.org/D42469 Reviewers: vpykhtin, artem.tamazov, arsenm llvm-svn: 323534	2018-01-26 15:43:29 +00:00
Simon Pilgrim	65ec923805	[X86][SSE] Add tests for vector truncation with PACKUS style signed saturation PACKUS - truncates signed value, saturating to [0,unsigned_max_trunc] llvm-svn: 323531	2018-01-26 14:58:50 +00:00
Alexey Bataev	167003df28	[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle. Summary: If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry and the cost of this gather is counted as the cost of InsertElementInstrs for each gathered value. But we can consider these elements as ShuffleInstr with SK_PermuteSingle shuffle kind. Reviewers: spatel, RKSimon, mkuper, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38697 llvm-svn: 323530	2018-01-26 14:31:09 +00:00
Dmitry Preobrazhensky	6cb42e7622	[AMDGPU][MC] Enabled disassembler for image atomic operations See bug 35988: https://bugs.llvm.org/show_bug.cgi?id=35988 Differential Revision: https://reviews.llvm.org/D42186 Reviewers: vpykhtin, artem.tamazov, arsenm llvm-svn: 323527	2018-01-26 14:07:38 +00:00
Francis Visoiu Mistrih	e4718e84e8	[MIR] Add support for addrspace in MIR Add support for printing / parsing the addrspace of a MachineMemOperand. Fixes PR35970. Differential Revision: https://reviews.llvm.org/D42502 llvm-svn: 323521	2018-01-26 11:47:28 +00:00
Daniil Fukalov	6e1dc68117	[AMDGPU] fix LDS f32 intrinsics - using qualified pointer addrspace in intrinsics class to avoid .f32 mangling - changed too common atomic mangling to ds - added missing intrinsics to AMDGPUTTIImpl::getTgtMemIntrinsic Reviewed by: b-sumner Differential Revision: https://reviews.llvm.org/D42383 llvm-svn: 323516	2018-01-26 11:09:38 +00:00
Florian Hahn	212afb9fd9	[CallSiteSplitting] Fix infinite loop when recording conditions. Fix infinite loop when recording conditions by correctly marking basic blocks as visited. Fixes https://bugs.llvm.org/show_bug.cgi?id=36105 llvm-svn: 323515	2018-01-26 10:36:50 +00:00
Momchil Velikov	d2cc6fd90b	[ARM] Accept a subset of Thumb GPR register class when emitting an SP-relative load instruction The function `Thumb1InstrInfo::loadRegFromStackSlot` accepts only the `tGPR` register class. The function serves to emit a `tLDRspi` instruction and certainly any subset of the `tGPR` register class is a valid destination of the load. Differential revision: https://reviews.llvm.org/D42535 llvm-svn: 323514	2018-01-26 10:20:58 +00:00
Andrei Elovikov	cbc5a688f3	[X86FixupBWInsts] Prefer positive checks in the test. NFC Reviewers: andrew.w.kaylor, craig.topper, MatzeB Reviewed By: andrew.w.kaylor Subscribers: aivchenk, llvm-commits Differential Revision: https://reviews.llvm.org/D42531 llvm-svn: 323513	2018-01-26 09:50:32 +00:00
Sjoerd Meijer	011de9c0ca	[ARM] Armv8.2-A FP16 code generation (part 1/3) This is the groundwork for Armv8.2-A FP16 code generation. Clang passes and returns _Float16 values as floats, together with the required bitconverts and truncs etc. to implement correct AAPCS behaviour, see D42318. We will implement half-precision argument passing/returning lowering in the ARM backend soon, but for now this means that this: _Float16 sub(_Float16 a, _Float16 b) { return a + b; } gets lowered to this: define float @sub(float %a.coerce, float %b.coerce) { entry: %0 = bitcast float %a.coerce to i32 %tmp.0.extract.trunc = trunc i32 %0 to i16 %1 = bitcast i16 %tmp.0.extract.trunc to half <SNIP> %add = fadd half %1, %3 <SNIP> } When FullFP16 is not supported, we don't make f16 a legal type, and we get legalization for "free", i.e. nothing changes and everything works as before. And also f16 argument passing/returning is handled. When FullFP16 is supported, we do make f16 a legal type, and have 2 places that we need to patch up: f16 argument passing and returning, which involves minor tweaks to avoid unnecessary code generation for some bitcasts. As a "demonstrator" that this works for the different FP16, FullFP16, softfp modes, etc., I've added match rules to the VSUB instruction description showing that we can codegen this instruction from IR, but more importantly, also to some conversion instructions. These conversions were causing issues before in the FP16 and FullFP16 cases. I've also added match rules to the VLDRH and VSTRH descriptions, so that we can actually compile the entire half-precision sub code example above. This showed that these loads and stores had the wrong addressing mode specified: AddrMode5 instead of AddrMode5FP16, which turned out not to be implemented at all, so that has also been added. This is the minimal patch that shows all the different moving parts. 
In patch 2/3 I will add some efficient lowering of bitcasts, and in 3/3 I will add the remaining Armv8.2-A FP16 instruction descriptions. Thanks to Sam Parker and Oliver Stannard for their help and reviews! Differential Revision: https://reviews.llvm.org/D38315 llvm-svn: 323512	2018-01-26 09:26:40 +00:00
Hiroshi Inoue	0909ca132f	[NFC] fix trivial typos in comments and documents "in in" -> "in", "on on" -> "on" etc. llvm-svn: 323508	2018-01-26 08:15:29 +00:00
Shiva Chen	056d835fa4	[RISCV] Encode RISCV specific ELF e_flags to RISCV Binary by RISCVTargetStreamer llvm-svn: 323507	2018-01-26 07:53:07 +00:00
Serguei Katkov	9fe0524ee6	[CGP] Re-enable Select in complex addressing mode. Switch Select handling on after fixing two bugs: rL323192 and rL323497. llvm-svn: 323498	2018-01-26 06:26:56 +00:00
Serguei Katkov	1ce7137c99	[X86] Fix killed flag handling in X86FixupLea pass When pass creates a MOV instruction for lea (%base,%index,1), %dst => mov %base,%dst; add %index,%dst modification it should clean the killed flag for base if base is equal to index. Otherwise verifier complains about usage of killed register in add instruction. Reviewers: lsaba, zvi, zansari, aaboud Reviewed By: lsaba Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42522 llvm-svn: 323497	2018-01-26 04:49:26 +00:00
Shoaib Meenai	d8fd16b08f	[CodeGen] Ignore private symbols in llvm.used for COFF Similar to the existing handling for internal symbols, private symbols are also not visible to the linker and should be ignored. llvm-svn: 323483	2018-01-26 00:15:25 +00:00
Vedant Kumar	6394df9fc4	[Debug] LCSSA: Insert dbg.value at the first available insertion point Inserting a dbg.value instruction at the start of a basic block with a landingpad instruction triggers a verifier failure. We should be OK if we insert the instruction a bit later. Speculative fix for the bot failure described here: https://reviews.llvm.org/D42551 llvm-svn: 323482	2018-01-25 23:48:29 +00:00
Jake Ehrlich	76e9110f3d	[llvm-objcopy] Refactor llvm-objcopy to use reader and writer objects While writing code for input and output formats in llvm-objcopy it became apparent that there was a code health problem. This change attempts to solve that problem by refactoring the code to use Reader and Writer objects that can read in different objects in different formats, convert them to a single shared internal representation, and then write them to any other representation. New classes: Reader: the base class used to construct instances of the internal representation Writer: the base class used to write out instances of the internal representation ELFBuilder: a helper class for ELFWriter that takes an ELFFile and converts it to an Object SectionVisitor: it became necessary to remove writeSection from SectionBase because, under the new Reader/Writer scheme, it's possible to convert between ELF Types such as ELF32LE and ELF32BE. This isn't possible with writeSection because it (dynamically) depends on the underlying section type and (statically) depends on the ELF type. Bad things would happen if the underlying sections for ELF32LE were used for writing to ELF64BE. To avoid this code smell (which would have compiled, run, and output some nonsense) I decoupled writing of sections from a class. SectionWriter: This is just the ELFT templated implementation of SectionVisitor. Many classes now have this class as a friend so that the writing methods in this class can write out private data. ELFWriter: This is the Writer that outputs to ELF BinaryWriter: This is the Writer that outputs to Binary ElfType: Because the ELF Type is not a part of the Object anymore we need a way to construct the correct default Writer based on properties of the Reader. This enum just keeps track of the ELF type of the input so it can be used as the default output type as well. Object has correspondingly undergone some serious changes as well. 
It now has more generic methods for building and manipulating ELF binaries. This interface makes ELFBuilder easy enough to use and will make the BinaryReader/Builder easy to create as well. Most changes in this diff are cosmetic and deal with the fact that a method has been moved from one class to another or a change from a pointer to a reference. Almost no changes should result in a functional difference (this is after all a refactor). One minor functional change was made and the result can be seen in remove-shstrtab-error.test. The fact that it fails hasn't changed but the error message has changed because that failure is detected at a later point in the code now (because WriteSectionHeaders is a property of the ElfWriter not a property of the Object). I'd say roughly 80-90% of this code is cosmetically different, 10-19% is different but functionally the same, and 1-5% is functionally different despite not causing a change in tests. Differential Revision: https://reviews.llvm.org/D42222 llvm-svn: 323480	2018-01-25 22:46:17 +00:00
Easwaran Raman	6b7209b3f1	Add testcase accidentally left out from r323460. llvm-svn: 323478	2018-01-25 22:23:52 +00:00
Jake Ehrlich	ea07d3cf65	[llvm-objcopy] Add --add-gnu-debuglink This change adds support for --add-gnu-debuglink to llvm-objcopy Differential Revision: https://reviews.llvm.org/D41731 llvm-svn: 323477	2018-01-25 22:15:14 +00:00
Paul Robinson	b6aa01ca99	[DWARFv5] Support DW_FORM_line_strp in llvm-dwarfdump. This form is like DW_FORM_strp, but points to .debug_line_str instead of .debug_str as the string section. It's intended to be used from the line-table header, and allows string-pooling of directory and filenames across compilation units. Differential Revision: https://reviews.llvm.org/D42553 llvm-svn: 323476	2018-01-25 22:02:36 +00:00
Vedant Kumar	60f54084bf	[Debug] Add dbg.value intrinsics for PHIs created during LCSSA. This patch is an enhancement to propagate dbg.value information when Phis are created on behalf of LCSSA. I noticed a case where a value carried across a loop was reported as <optimized out>. Specifically this case: int bar(int x, int y) { return x + y; } int foo(int size) { int val = 0; for (int i = 0; i < size; ++i) { val = bar(val, i); // Both val and i are correct } return val; // <optimized out> } In the above case, after all of the interesting computation completes our value is reported as "optimized out." This change will add a dbg.value to correct this. This patch also moves the dbg.value insertion routine from LoopRotation.cpp into Local.cpp, so that we can share it in both places (LoopRotation and LCSSA). Patch by Matt Davis! Differential Revision: https://reviews.llvm.org/D42551 llvm-svn: 323472	2018-01-25 21:37:07 +00:00
Craig Topper	6fd634b11b	[X86] Teach Intel syntax InstPrinter to print lock prefixes that have been parsed from the asm parser. The asm parser puts the lock prefix in the MCInst flags so we need to check that in addition to TSFlags. This matches what the ATT printer does. llvm-svn: 323469	2018-01-25 21:23:57 +00:00
Aaron Ballman	4af8836398	Revert r322132; it appears to be an accidental commit, based on the commit message. The original author of the commit has not commented on whether this was accidental or purposeful, so if this revert is in error, the author can re-commit with an actual commit message. llvm-svn: 323466	2018-01-25 21:08:23 +00:00
Aaron Ballman	09f46a76d9	Reverting r323463 as it appears to be an accidental commit. Regardless, it broke a lot of build bots, so reverting back to green. http://lab.llvm.org:8011/builders/lldb-amd64-ninja-netbsd8/builds/9294 http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/24084 http://lab.llvm.org:8011/builders/clang-ppc64le-linux-lnt/builds/9567 llvm-svn: 323465	2018-01-25 21:03:38 +00:00
Jake Ehrlich	df35594077	tmp llvm-svn: 323463	2018-01-25 20:24:17 +00:00
Vedant Kumar	8a816f0c9b	Revert "asan: add kernel inline instrumentation test" This reverts commit r323451. It breaks this bot: http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/24077 llvm-svn: 323454	2018-01-25 18:20:19 +00:00
Krzysztof Parzyszek	b2c458e648	[Hexagon] SETEQ and SETNE are valid integer condition codes llvm-svn: 323452	2018-01-25 18:07:27 +00:00
Vedant Kumar	d22f07bbbe	asan: add kernel inline instrumentation test Patch by Andrey Konovalov! Differential Revision: https://reviews.llvm.org/D42473 llvm-svn: 323451	2018-01-25 18:05:44 +00:00
Alexey Bataev	102d4b59f9	Revert "[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle." This reverts commit r323441 to fix buildbots. llvm-svn: 323447	2018-01-25 17:28:12 +00:00
Alexey Bataev	c8cfa14b6d	[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle. Summary: If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry and the cost of this gather is counted as the cost of InsertElementInstrs for each gathered value. But we can consider these elements as ShuffleInstr with SK_PermuteSingle shuffle kind. Reviewers: spatel, RKSimon, mkuper, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38697 llvm-svn: 323441	2018-01-25 16:45:18 +00:00
Sanjay Patel	1d68112c4b	[InstCombine] narrow masked zexted binops (PR35792) This is guarded by shouldChangeType(), so the tests show that we don't do the fold if the narrower type is not legal. Note that there is a proposal (D42424) that would change the results for the specific cases shown in these tests. That difference is also discussed in PR35792: https://bugs.llvm.org/show_bug.cgi?id=35792 Alive proofs for the cases handled here as well as the bitwise logic binops that we should already do better on: https://rise4fun.com/Alive/c97 https://rise4fun.com/Alive/Lc5E https://rise4fun.com/Alive/kdf llvm-svn: 323437	2018-01-25 16:34:36 +00:00
Sanjay Patel	0f95dd234d	[InstCombine] add tests for PR35792; NFC llvm-svn: 323436	2018-01-25 16:03:44 +00:00
Alexey Bataev	a0b2c78efc	Revert "[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle." This reverts commit r323430 to fix buildbots. llvm-svn: 323432	2018-01-25 15:20:29 +00:00
Alexey Bataev	ad51fe3644	[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle. Summary: If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry and the cost of this gather is counted as the cost of InsertElementInstrs for each gathered value. But we can consider these elements as ShuffleInstr with SK_PermuteSingle shuffle kind. Reviewers: spatel, RKSimon, mkuper, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38697 llvm-svn: 323430	2018-01-25 15:01:36 +00:00
Simon Pilgrim	fb01d06669	[X86][SSE] Add tests for vector truncation with signed saturation AVX512 isn't using X86ISD::VTRUNCS and SSE/AVX isn't using PACKSS/PACKUS llvm-svn: 323428	2018-01-25 14:56:21 +00:00
Simon Pilgrim	e59bf81e74	[X86][SSE] Add tests for vector truncation with unsigned saturation AVX512 tends to do a good job, but there are some missed opportunities with SSE/AVX llvm-svn: 323422	2018-01-25 14:28:55 +00:00


50814 Commits