llvm-project

Commit Graph

Author	SHA1	Message	Date
Jatin Bhateja	80b5e38c4e	Updating a test reference for rL312608. Differential Revision: https://reviews.llvm.org/D37501 llvm-svn: 312614	2017-09-06 03:58:14 +00:00
Craig Topper	eec768b5c4	[X86] Add more FMA3 patterns to cover a load in all 3 possible positions. This matches what we already do for AVX512. The peephole pass makes up for this in most if not all cases. But this makes isel behavior for these consistent with every other instruction. llvm-svn: 312613	2017-09-06 03:35:58 +00:00
Hal Finkel	112a6bac72	[PowerPC] Don't use xscvdpspn on the P7 xscvdpspn was not introduced until the P8, so don't use it on the P7. Fixes a regression introduced in r288152. llvm-svn: 312612	2017-09-06 03:08:26 +00:00
Jatin Bhateja	2c139f77c7	[X86] Allow cross-lane permutations for sub targets supporting AVX2. Summary: Most instructions in AVX work “in-lane”, that is, each source element is applied only to other elements of the same lane, thus a cross lane permutation is costly and needs more than one instruction. AVX2 includes instructions to perform any-to-any permutation of words over a 256-bit register and vectorized table lookup. This should also fix PR34369 Differential Revision: https://reviews.llvm.org/D37388 llvm-svn: 312608	2017-09-06 02:58:47 +00:00
Lang Hames	6dbf0876c1	[ORC] Fix some comments in JITSymbol. Patch by Breckin Loggins. Thanks Breckin! llvm-svn: 312607	2017-09-06 02:53:37 +00:00
Eric Beckmann	0aa4b7d4c5	Fix crbug 759265 by suppressing llvm mt warnings. Summary: Previously, a warning was thrown whenever libxml2 was not installed. Now this warning is given only if merging the manifest fails. Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37240 llvm-svn: 312604	2017-09-06 01:50:36 +00:00
Rafael Espindola	dc8b7a96bd	Use the section name if a STT_SECTION symbol has empty name. Without this we would have multiple relocations pointing to symbols with the same name: the empty string. There was no way for yaml2obj to be able to handle that. A more general solution would be to unique symbol names in a similar way to how we unique section names. In practice I think this covers all common cases and is a bit more user friendly than using names like sym1, sym2, sym3, etc. llvm-svn: 312603	2017-09-06 00:57:53 +00:00
Yaxun Liu	fc5121a722	[AMDGPU] Transform __read_pipe_* and __write_pipe_* When packet size equals packet align and is power of 2, transform __read_pipe* and __write_pipe* to specialized library function. Differential Revision: https://reviews.llvm.org/D36831 llvm-svn: 312598	2017-09-06 00:30:27 +00:00
Sanjay Patel	6840c5ff75	[ValueTracking, InstCombine] canonicalize fcmp ord/uno with non-NAN ops to null constants This is a preliminary step towards solving the remaining part of PR27145 - IR for isfinite(): https://bugs.llvm.org/show_bug.cgi?id=27145 In order to solve that one more generally, we need to add matching for and/or of fcmp ord/uno with a constant operand. But while looking at those patterns, I realized we were missing a canonicalization for nonzero constants. Rather than limiting to just folds for constants, we're adding a general value tracking method for this based on an existing DAG helper. By transforming everything to 0.0, we can simplify the existing code in foldLogicOfFCmps() and pick up missing vector folds. Differential Revision: https://reviews.llvm.org/D37427 llvm-svn: 312591	2017-09-05 23:13:13 +00:00
Rafael Espindola	8db11a4f1c	Fix a use after free. llvm-svn: 312590	2017-09-05 23:00:51 +00:00
Eli Friedman	c22c699882	[ARM] Make ARMExpandPseudo add implicit uses for predicated instructions Missing these could potentially screw up post-ra scheduling. Issue found by inspection, so I don't have a real testcase. Included test just verifies the expected operands after expansion. Differential Revision: https://reviews.llvm.org/D35156 llvm-svn: 312589	2017-09-05 22:54:06 +00:00
Eli Friedman	06d0ee734a	[ARM] Register ARMExpandPseudo pass. This allows -run-pass etc. to refer to it. (Split off from D35156.) llvm-svn: 312587	2017-09-05 22:45:23 +00:00
Rafael Espindola	88ee57ebed	obj2yaml: Print unique section names. Without this patch passing a .o file with multiple sections with the same name to obj2yaml produces a yaml file that yaml2obj cannot handle. This is pr34162. The problem is that when specifying, for example, the section of a symbol, we get only Section: foo and don't know which of the sections whose name is foo we have to use. One alternative would be to use section numbers. This would work, but the output from obj2yaml would be very inconvenient to edit as deleting a section would invalidate all indexes. Another alternative would be to invent a unique section id that would exist only on yaml. This would work, but seems a bit heavy handed. We could make the id optional and default it to the section name. Since in the last alternative the id is basically what this patch uses as a name, it can be implemented as a followup patch if needed. llvm-svn: 312585	2017-09-05 22:30:00 +00:00
Lang Hames	4c74402601	[ORC] Convert null remote symbols to null JITSymbols. The existing code created a JITSymbol with an invalid materializer instead, guaranteeing a 'missing symbol' error when someone tried to materialize the symbol. llvm-svn: 312584	2017-09-05 22:24:40 +00:00
Zachary Turner	37c747498d	[CodeView] Don't output S_UDTs for nested typedefs. S_UDT records are basically the "bridge" between the debugger's expression evaluator and the type information. If you type (Foo)nullptr into the watch window, the debugger looks for an S_UDT record named Foo. If it can find one, it displays your type. Otherwise you get an error. We have always understood this to mean that if you have code like this: struct A { int X; }; struct B { typedef A AT; AT Member; }; that you will get 3 S_UDT records. "A", "B", and "B::AT". Because if you were to type (B::AT)nullptr into the debugger, it would need to find an S_UDT record named "B::AT". But "B::AT" is actually the S_UDT record that would be generated if B were a namespace, not a struct. So the debugger needs to be able to distinguish this case. So what it does is: 1. Look for an S_UDT named "B::AT". If it finds one, it knows that AT is in a namespace. 2. If it doesn't find one, split at the scope resolution operator, and look for an S_UDT named B. If it finds one, look up the type for B, and then look for AT as one of its members. With this algorithm, S_UDT records for nested typedefs are not just unnecessary, but actually wrong! The results of implementing this in clang are dramatic. It cuts our /DEBUG:FASTLINK PDB sizes by more than 50%, and we go from being ~20% larger than MSVC PDBs on average, to ~40% smaller. It also slightly speeds up link time. We get about 10% faster links than without this patch. Differential Revision: https://reviews.llvm.org/D37410 llvm-svn: 312583	2017-09-05 22:06:39 +00:00
Vedant Kumar	3ae4170480	Revert "[Decompression] Fail gracefully when out of memory" This reverts commit r312526. Revert "Fix test/DebugInfo/dwarfdump-decompression-invalid-size.test" This reverts commit r312527. It causes an ASan failure: http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check/4150 llvm-svn: 312582	2017-09-05 22:04:00 +00:00
Davide Italiano	f887406a7d	[unittest/ReverseIteration] Unbreak when compiling with GCC. llvm-svn: 312579	2017-09-05 21:27:23 +00:00
Sanjay Patel	18e126e5d4	[InstCombine] add nnan tests; NFC As suggested in D37427, we could have a value tracking function and folds that use it to simplify these cases. llvm-svn: 312578	2017-09-05 21:20:35 +00:00
Davide Italiano	32504cf661	[GVNHoist] Move duplicated code to a helper function. NFCI. llvm-svn: 312575	2017-09-05 20:49:41 +00:00
Mandeep Singh Grang	9837e9945f	[unittests] Add reverse iteration unit test for pointer-like keys Reviewers: dblaikie, efriedma, mehdi_amini Reviewed By: dblaikie Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37241 llvm-svn: 312574	2017-09-05 20:39:01 +00:00
Reid Kleckner	d4523689a6	Fix RST syntax in LangRef for llvm.codeview.annotation intrinsic llvm-svn: 312571	2017-09-05 20:26:25 +00:00
Reid Kleckner	e33c94f1b0	Add llvm.codeview.annotation to implement MSVC __annotation Summary: This intrinsic represents a label with a list of associated metadata strings. It is modelled as reading and writing inaccessible memory so that it won't be removed as dead code. I think the intention is that the annotation strings should appear at most once in the debug info, so I marked it noduplicate. We are allowed to inline code with annotations as long as we strip the annotation, but that can be done later. Reviewers: majnemer Subscribers: eraman, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D36904 llvm-svn: 312569	2017-09-05 20:14:58 +00:00
Daniel Neilson	3f0e4ad833	[SCEV] Ensure ScalarEvolution::createAddRecFromPHIWithCastsImpl properly handles out of range truncations of the start and accum values Summary: When constructing the predicate P1 in ScalarEvolution::createAddRecFromPHIWithCastsImpl() it is possible for the PHISCEV from which the predicate is constructed to be a SCEVConstant instead of a SCEVAddRec. If this happens, then the cast<SCEVAddRec>(PHISCEV) in the code will assert. Such a PHISCEV is possible if either the start value or the accumulator value is a constant value that is not equal to its truncated value, and if the truncated value is zero. This patch adds tests that demonstrate the cast<> assertion, and fixes this problem by checking whether the PHISCEV is a constant before constructing the P1 predicate; if it is, then P1 is equivalent to one of P2 or P3. Additionally, if we know that the start value or accumulator value are constants then we check whether the P2 and/or P3 predicates are known false at compile time; if either is, then we bail out of constructing the AddRec. Reviewers: sanjoy, mkazantsev, silviu.baranga Reviewed By: mkazantsev Subscribers: mkazantsev, llvm-commits Differential Revision: https://reviews.llvm.org/D37265 llvm-svn: 312568	2017-09-05 19:54:03 +00:00
Peter Collingbourne	d0e9c167d8	LTO: Try to open cache files before renaming them. It appears that a potential race between the cache client and the cache pruner that I thought was unlikely actually happened in practice [1]. Try to avoid the race condition by opening the temporary file before renaming it. Do this only on non-Windows platforms because we cannot rename open files on Windows using the sys::fs::rename function. [1] https://luci-logdog.appspot.com/v/?s=chromium%2Fbb%2Fchromium.memory%2FLinux_CFI%2F1610%2F%2B%2Frecipes%2Fsteps%2Fcompile%2F0%2Fstdout Differential Revision: https://reviews.llvm.org/D37410 llvm-svn: 312567	2017-09-05 19:51:38 +00:00
Craig Topper	784fa8a4e3	[X86] Remove unnecessary (v4f32 (X86vzmovl (v4f32 (scalar_to_vector FR32X)))) patterns We had already disabled the pattern for SSE4.1 and SSE4.2. But it got re-enabled for AVX and AVX512. With SSE41 we rely on a separate (v4f32 (X86vzmovl VR128)) pattern to select blendps with a xorps to create zeroes. And a separate (v4f32 (scalar_to_vector FR32X)) to select a COPY_TO_REG_CLASS to move FR32 to VR128. The same thing can happen for AVX with vblendps and those separate patterns already exist. For AVX512, (v4f32 (X86vzmov VR128)) will select a VMOVSS instruction instead of VBLENDPS due to there not being an EVEX VBLENDPS. This is what we were getting out of the larger pattern anyway. So the larger pattern is unneeded for AVX512 too. For SSE1-SSSE3 we can rely on (v4f32 (X86vzmov VR128)) selecting a MOVSS similar to AVX512. Again this is what the larger pattern did too. So the only real change here is that AVX1/2 now properly outputs a VBLENDPS during isel instead of a VMOVSS to match SSE41. Most tests didn't notice because the two address instruction pass knows how to turn VMOVSS into VBLENDPS to get an independent destination register. llvm-svn: 312564	2017-09-05 19:09:02 +00:00
Konstantin Zhuravlyov	80528702c9	AMDGPU: Cleanup/refactor SIMemoryLegalizer [3]: - Refactor SIMemOpInfo's constructors - Allow construction of NotAtomic SIMemOpInfo Differential Revision: https://reviews.llvm.org/D37396 llvm-svn: 312563	2017-09-05 19:01:10 +00:00
Matt Arsenault	22cdb61a78	AMDGPU: Fix not accounting for tail call resource usage If the only call in a function is a tail call, the function isn't considered to have a call since it's a type of return. llvm-svn: 312561	2017-09-05 18:36:36 +00:00
Zvi Rackover	2096893f34	X86 Tests: Adding missing AVX512 fptoui coverage tests. NFC. Some of the cases show missing patterns I intend to fix shortly. llvm-svn: 312560	2017-09-05 18:24:39 +00:00
Tony Jiang	61ef1c540c	[PPC][NFC] Renaming things with 'xxinsert' moniker to 'vecinsert' to make it more general. Commit on behalf of Graham Yiu (gyiu@ca.ibm.com) llvm-svn: 312547	2017-09-05 18:08:02 +00:00
Adam Nemet	9c35f6383b	Split opt-remark YAML and opt output testing on this test This prepares for https://reviews.llvm.org/D33514 llvm-svn: 312544	2017-09-05 18:03:39 +00:00
Craig Topper	33caeadd90	[AVX512] Remove patterns for (v8f32 (X86vzmovl (insert_subvector undef, (v4f32 (scalar_to_vector FR32X:)), (iPTR 0)))) and the same for v4f64. We don't have this same pattern for AVX2 so I don't believe we should have it for AVX512. We also didn't have it for v16f32. llvm-svn: 312543	2017-09-05 17:33:58 +00:00
Konstantin Zhuravlyov	1aa667fe64	AMDGPU/NFC: Cleanup/refactor SIMemoryLegalizer [2]: - Make SIMemOpInfo a class - Add accessor methods to SIMemOpInfo - Move get*Info methods to SIMemOpInfo Differential Revision: https://reviews.llvm.org/D37395 llvm-svn: 312541	2017-09-05 16:41:25 +00:00
Konstantin Zhuravlyov	844845ae06	AMDGPU/NFC: Cleanup/refactor SIMemoryLegalizer [1]: - Rename MemOpInfo -> SIMemOpInfo - Move SIMemOpInfo class out of SIMemoryLegalizer class Differential Revision: https://reviews.llvm.org/D37394 llvm-svn: 312540	2017-09-05 16:18:05 +00:00
Simon Pilgrim	ab48e5e244	[AMDGPU] Added extra test checks to make D19325 diff clearer llvm-svn: 312537	2017-09-05 14:32:06 +00:00
Simon Pilgrim	49f9ba37d8	[X86] Limit store merge size when implicitfloat is enabled (PR34421) As suggested by @niravd : https://bugs.llvm.org/show_bug.cgi?id=34421#c2 Differential Revision: https://reviews.llvm.org/D37464 llvm-svn: 312534	2017-09-05 13:40:29 +00:00
Simon Pilgrim	60ea09eaca	Strip trailing whitespace. NFCI. llvm-svn: 312531	2017-09-05 12:32:16 +00:00
Simon Pilgrim	8dbd745b09	[X86] Regenerate scalar rotation tests llvm-svn: 312530	2017-09-05 12:28:30 +00:00
Simon Pilgrim	08246d185b	[X86][AVX512] Use AVX512 attributes instead of -mcpu in vector shift tests llvm-svn: 312529	2017-09-05 12:23:45 +00:00
Simon Pilgrim	3cbe005a69	[X86][AVX512] Use AVX512 attributes instead of -mcpu llvm-svn: 312528	2017-09-05 12:05:14 +00:00
Jonas Devlieghere	8228b8d503	Fix test/DebugInfo/dwarfdump-decompression-invalid-size.test llvm-svn: 312527	2017-09-05 11:59:16 +00:00
Jonas Devlieghere	0992d38277	[Decompression] Fail gracefully when out of memory This patch adds failing gracefully when running out of memory when allocating a buffer for decompression. This provides a work-around for: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3224 Differential revision: https://reviews.llvm.org/D37447 llvm-svn: 312526	2017-09-05 11:21:38 +00:00
Diana Picus	ac15473cdd	[ARM] GlobalISel: Minor cleanups in inst selector Use the STI member of ARMInstructionSelector instead of TII.getSubtarget() and also make use of STI's methods instead of checking the object format manually. llvm-svn: 312522	2017-09-05 08:22:47 +00:00
Diana Picus	abb088691b	[ARM] GlobalISel: Support global variables for RWPI In RWPI code, globals that are not read-only are accessed relative to the SB register (R9). This is achieved by explicitly generating an ADD instruction between SB and an offset that we either load from a constant pool or movw + movt into a register. llvm-svn: 312521	2017-09-05 07:57:41 +00:00
Craig Topper	c228d790af	[X86] Add hasSideEffects=0 and mayLoad=1 to some instructions that recently had their patterns removed. llvm-svn: 312520	2017-09-05 05:49:44 +00:00
Craig Topper	43c80be2e5	[InstCombine] Add test cases for folding (select (icmp ne/eq (and X, C1), (bitwiseop Y, C2), Y -> (bitwiseop Y, (shl/shr (and X, C1), C3)) or similar. This is possible if C1 and C2 are both powers of 2. Or if binop is 'and' then ~C2 needs to be a power of 2. We already support this for 'or', but we should be able to support 'and' and 'xor'. This will be enhanced by D37274. llvm-svn: 312519	2017-09-05 05:26:38 +00:00
Craig Topper	28d6d962d5	[InstCombine] Move foldSelectICmpAnd helper function earlier in the file to enable reuse in a future patch. llvm-svn: 312518	2017-09-05 05:26:37 +00:00
Craig Topper	4c766a0559	[InstCombine] In foldSelectIntoOp, avoid creating a Constant before we know for sure we're going to use it and avoid an unnecessary call to m_APInt. Instead of creating a Constant and then calling m_APInt with it (which will always return true). Just create an APInt initially, and use that for the checks in isSelect01 function. If it turns out we do need the Constant, create it from the APInt. This is a refactor for a future patch that will do some more checks of the constant values here. llvm-svn: 312517	2017-09-05 05:26:36 +00:00
Lang Hames	80577cb6d4	[ORC] Add some more docs/comments to the RemoteObjectLayer. llvm-svn: 312516	2017-09-05 05:06:05 +00:00
Lang Hames	67b573c62c	[ORC] Exclude RemoteObjectLayer from the ExecutionEngine module, as modules builds seem to be having trouble with it. http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules-2/builds/11401 When trying to link lli-child-target, the linker reports missing symbols for the 'Name' members of 'rpc::Function<OrcRPCNegotiate, FunctionIdT(std::string)>' (base class for OrcRPCNegotiate) and 'rpc::Function<OrcRPCResponse, void()>' (base class for OrcRPCResponse), despite there being definitions for these immediately below the rpc::Function class template. This looks like the same bug that bit OrcRemoteTargetClient/Server in r286920. <rdar://problem/34249745> llvm-svn: 312515	2017-09-05 04:31:14 +00:00
Hiroshi Inoue	614453b797	[PowerPC] eliminate redundant compare instruction If multiple conditional branches are executed based on the same comparison, we can execute multiple conditional branches based on the result of one comparison on PPC. For example, if (a == 0) { ... } else if (a < 0) { ... } can be executed by one compare and two conditional branches instead of two pairs of a compare and a conditional branch. This patch identifies a code sequence of the two pairs of a compare and a conditional branch and merges the compares if possible. To maximize the opportunity, we do canonicalization of code sequence before merging compares. For the above example, the input for this pass looks like: cmplwi r3, 0 beq 0, .LBB0_3 cmpwi r3, -1 bgt 0, .LBB0_4 So, before merging two compares, we canonicalize it as cmpwi r3, 0 ; cmplwi and cmpwi yield same result for beq beq 0, .LBB0_3 cmpwi r3, 0 ; greater than -1 means greater or equal to 0 bge 0, .LBB0_4 The generated code should be cmpwi r3, 0 beq 0, .LBB0_3 bge 0, .LBB0_4 Differential Revision: https://reviews.llvm.org/D37211 llvm-svn: 312514	2017-09-05 04:15:17 +00:00
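
The compare canonicalization described in the last entry above (614453b797) can be checked with a small model. The following is only a sketch in Python, not the LLVM pass itself: the helper names cmpwi, cmplwi, original and merged are hypothetical, and the condition-register field is modelled as a plain (lt, gt, eq) tuple. It shows that the original two-compare sequence and the merged one-compare sequence take the same branch for signed 32-bit inputs.

```python
# Hypothetical model of a condition-register field as a tuple (lt, gt, eq).
# None of these helpers are LLVM APIs; they only mimic the instructions
# named in the commit message.

def cmpwi(a, imm):
    """Signed compare of a against an immediate."""
    return (a < imm, a > imm, a == imm)

def cmplwi(a, imm):
    """Unsigned 32-bit compare: reinterpret both operands modulo 2**32."""
    ua, uimm = a & 0xFFFFFFFF, imm & 0xFFFFFFFF
    return (ua < uimm, ua > uimm, ua == uimm)

def original(a):
    """cmplwi r3, 0 ; beq .LBB0_3 ; cmpwi r3, -1 ; bgt .LBB0_4"""
    _, _, eq = cmplwi(a, 0)
    if eq:
        return ".LBB0_3"
    _, gt, _ = cmpwi(a, -1)
    if gt:
        return ".LBB0_4"
    return "fallthrough"

def merged(a):
    """cmpwi r3, 0 ; beq .LBB0_3 ; bge .LBB0_4 (a single compare)"""
    lt, _, eq = cmpwi(a, 0)
    if eq:
        return ".LBB0_3"
    if not lt:  # bge branches when the lt bit is clear
        return ".LBB0_4"
    return "fallthrough"

# Compare both sequences on a few signed 32-bit values, including the edges.
samples = [-2**31, -2, -1, 0, 1, 2, 2**31 - 1]
results = [(a, original(a), merged(a)) for a in samples]
```

The two facts the canonicalization relies on are visible here: equality with zero does not depend on signedness, and for integers "greater than -1" is the same condition as "greater than or equal to 0".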


153786 Commits