llvm-project

Commit Graph

Author	SHA1	Message	Date
Kristina Brooks	4f197cd736	Fix incorrect Twine usage in CFGPrinter CFGPrinter (-view-cfg, -dot-cfg) invokes an undefined behaviour (dangling pointer to rvalue) on IR files with branch weights. This patch fixes the problem caused by Twine initialization and string conversion split into two statements. This change fixes the bug 37019. A similar patch to this problem was provided in the llvmlite project Patch by mcopik (Marcin Copik). Differential Revision: https://reviews.llvm.org/D52933 llvm-svn: 343984	2018-10-08 17:29:39 +00:00
Rui Ueyama	9b5a495d48	Fix a broken buildbot. llvm-svn: 343983	2018-10-08 17:24:29 +00:00
Eric Liu	bce181c64f	[clang-move] Dump whether a declaration is templated. llvm-svn: 343982	2018-10-08 17:22:50 +00:00
Kamil Rytarowski	0fbf3e997c	Disable TestCases/pthread_mutexattr_get on NetBSD The pshared feature is unsupported on NetBSD as of today. llvm-svn: 343981	2018-10-08 17:12:38 +00:00
Kamil Rytarowski	73214e316d	Fix Posix/devname_r for NetBSD NetBSD returns a different type as a return value of devname_r(3) than FreeBSD and Darwin (int vs char*). This implies that checking for successful completion of this function has to be handled differently. This test used to work well, but was switched to fix Darwin, which broke NetBSD. Add a dedicated ifdef for NetBSD and make it functional again for this OS. llvm-svn: 343980	2018-10-08 17:06:00 +00:00
Rui Ueyama	e28c146423	Avoid unnecessary buffer allocation and memcpy for compressed sections. Previously, we uncompressed all compressed sections before doing anything. That works, and that is conceptually simple, but that could result in a waste of CPU time and memory if uncompressed sections are then discarded or just copied to the output buffer. In particular, if .debug_gnu_pub{names,types} are compressed and if no -gdb-index option is given, we wasted CPU and memory because we uncompressed them into newly allocated buffers and then memcpy'd the buffers to the output buffer. That temporary buffer was redundant. This patch changes how to uncompress sections. Now, compressed sections are uncompressed lazily. To do that, `Data` member of `InputSectionBase` is now hidden from outside, and `data()` accessor automatically expands a compressed buffer if necessary. If no one calls `data()`, then `writeTo()` directly uncompresses compressed data into the output buffer. That eliminates the redundant memory allocation and redundant memcpy. This patch significantly reduces memory consumption (20 GiB max RSS to 15 GiB) for an executable whose .debug_gnu_pub{names,types} are in total 5 GiB in an uncompressed form. Differential Revision: https://reviews.llvm.org/D52917 llvm-svn: 343979	2018-10-08 16:58:59 +00:00
Nicolai Haehnle	ea36cd595c	AMDGPU: Future-proof {raw,struct}.buffer.atomic intrinsics Summary: The ISA is really supposed to support 64-bit atomics as well, so the data type should be an overload. Mesa doesn't use these atomics yet, in fact I noticed this issue while trying to use the atomics from Mesa. Change-Id: I77f58317a085a0d3eb933cc7e99308c48a19f83e Reviewers: tpr Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, t-tye, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D52291 llvm-svn: 343978	2018-10-08 16:53:48 +00:00
Nicolai Haehnle	46c91fd233	TableGen/CodeGenDAGPatterns: addPredicateFn only once Summary: The predicate function is added in InlinePatternFragments, no need to do it here. As a result, all uses of addPredicateFn are located in InlinePatternFragments. Test confirmed that there are no changes to generated files when building all (non-experimental) targets. Change-Id: I720e42e045ca596eb0aa339fb61adf6fe71034d5 Reviewers: arsenm, rampitec, RKSimon, craig.topper, hfinkel, uweigand Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D51993 llvm-svn: 343977	2018-10-08 16:53:31 +00:00
Xin Tong	8dd92482ce	Fix test case for @r343970 op2 for weakodr symbols is 101 from bcanalyzer. llvm-svn: 343976	2018-10-08 16:38:00 +00:00
Sanjay Patel	8459465a44	[x86] add hadd test with no undefs, remove duplicate tests; NFC llvm-svn: 343975	2018-10-08 16:24:43 +00:00
Sanjay Patel	d48789c0c4	[x86] simplify hadd tests; NFC The tests from PR39195 don't use 2 parameters. That's the root problem for the pattern matching in isHorizontalBinOp(). llvm-svn: 343974	2018-10-08 15:56:28 +00:00
Neil Henning	6641657453	[AMDGPU] Add an AMDGPU specific atomic optimizer. This commit adds a new IR level pass to the AMDGPU backend to perform atomic optimizations. It works by: - Running through a function and finding atomicrmw add/sub or uses of the atomic buffer intrinsics for add/sub. - If all arguments except the value to be added/subtracted are uniform, record the value to be optimized. - Run through the atomic operations we can optimize and, depending on whether the value is uniform/divergent use wavefront wide operations (DPP in the divergent case) to calculate the total amount to be atomically added/subtracted. - Then let only a single lane of each wavefront perform the atomic operation, reducing the total number of atomic operations in flight. - Lastly we recombine the result from the single lane to each lane of the wavefront, and calculate our individual lanes offset into the final result. Differential Revision: https://reviews.llvm.org/D51969 llvm-svn: 343973	2018-10-08 15:49:19 +00:00
Sid Manning	307c7901d0	[ELF][HEXAGON] Add R_HEX_GOT_16_X support Differential Revision: https://reviews.llvm.org/D52909 llvm-svn: 343972	2018-10-08 15:32:46 +00:00
Zachary Turner	affaff8b60	Don't use back-quotes in a run line. This works on Windows, but seems to be breaking tests that use an external shell (e.g. bash) because backquote has special meaning. This particular argument wasn't crucial for the test, so I've just removed it. llvm-svn: 343971	2018-10-08 15:14:05 +00:00
Xin Tong	bfdad33b82	[ThinLTO] Keep non-prevailing (linkonce|weak)_odr symbols live Summary: If we have a symbol with (linkonce|weak)_odr linkage, we do not want to dead strip it even if it is not prevailing. IR level (linkonce|weak)_odr symbol can become non-prevailing when we mix ELF objects and IR objects where the (linkonce|weak)_odr symbol in the ELF object is prevailing and the ones in the IR objects are not. Stripping them will prevent us from doing optimizations with them. By not dead stripping them, we will convert these symbols to available_externally linkage as a result of non-prevailing and eventually dropping them after inlining. I modified cache-prevailing.ll to use linkonce linkage as it is testing whether cache prevailing bit is effective or not, not whether we should treat linkonce_odr alive or not Reviewers: tejohnson, pcc Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D52893 llvm-svn: 343970	2018-10-08 15:12:48 +00:00
Oliver Stannard	367b4741f4	[AArch64][v8.5A] Don't create BR instructions in outliner when BTI enabled When branch target identification is enabled, we can only do indirect tail-calls through x16 or x17. This means that the outliner can't transform a BLR instruction at the end of an outlined region into a BR. Differential revision: https://reviews.llvm.org/D52869 llvm-svn: 343969	2018-10-08 14:12:08 +00:00
Oliver Stannard	c922116a51	[AArch64][v8.5A] Restrict indirect tail calls to use x16/17 only when using BTI When branch target identification is enabled, all indirectly-callable functions start with a BTI C instruction. This instruction can only be the target of certain indirect branches (direct branches and fall-through are not affected): - A BLR instruction, in either a protected or unprotected page. - A BR instruction in a protected page, using x16 or x17. - A BR instruction in an unprotected page, using any register. Without BTI, we can use any non call-preserved register to hold the address for an indirect tail call. However, when BTI is enabled, then the code being compiled might be loaded into a BTI-protected page, where only x16 and x17 can be used for indirect tail calls. Legacy code without this restriction can still indirectly tail-call BTI-protected functions, because they will be loaded into an unprotected page, so any register is allowed. Differential revision: https://reviews.llvm.org/D52868 llvm-svn: 343968	2018-10-08 14:09:15 +00:00
Oliver Stannard	250e5a5b65	[AArch64][v8.5A] Branch Target Identification code-generation pass The Branch Target Identification extension, introduced to AArch64 in Armv8.5-A, adds the BTI instruction, which is used to mark valid targets of indirect branches. When enabled, the processor will trap if an instruction in a protected page tries to perform an indirect branch to any instruction other than a BTI. The BTI instruction uses encodings which were NOPs in earlier versions of the architecture, so BTI-enabled code will still run on earlier hardware, just without the extra protection. There are 3 variants of the BTI instruction, which are valid targets for different kinds of branches: - BTI C can be targeted by call instructions, and is intended to be used at function entry points. These are the BLR instruction, as well as BR with x16 or x17. These BR instructions are allowed for use in PLT entries, and we can also use them to allow indirect tail-calls. - BTI J can be targeted by BR only, and is intended to be used by jump tables. - BTI JC acts as both a BTI C and a BTI J instruction, and can be targeted by any BLR or BR instruction. Note that RET instructions are not restricted by branch target identification, the reason for this is that return addresses can be protected more effectively using return address signing. Direct branches and calls are also unaffected, as it is assumed that an attacker cannot modify executable pages (if they could, they wouldn't need to do a ROP/JOP attack). This patch adds a MachineFunctionPass which: - Adds a BTI C at the start of every function which could be indirectly called (either because it is address-taken, or externally visible so could be address-taken in another translation unit). - Adds a BTI J at the start of every basic block which could be indirectly branched to. This could be either done by a jump table, or by taking the address of the block (e.g. using the GCC label values extension). We only need to use BTI JC when a function is indirectly-callable, and takes the address of the entry block. I've not been able to trigger this from C or IR, but I've included a MIR test just in case. Using BTI C at function entries relies on the fact that no other code in BTI-protected pages uses indirect tail-calls, unless they use x16 or x17 to hold the address. I'll add that code-generation restriction as a separate patch. Differential revision: https://reviews.llvm.org/D52867 llvm-svn: 343967	2018-10-08 14:04:24 +00:00
Alexander Ivchenko	1aedf203dd	[GlobalIsel][X86] Support G_UDIV/G_UREM/G_SREM Support G_UDIV/G_UREM/G_SREM. The instruction selection code is taken from FastISel with only minor tweaks to adapt for GlobalISel. Differential Revision: https://reviews.llvm.org/D49781 llvm-svn: 343966	2018-10-08 13:40:34 +00:00
Sanjay Patel	60badd7584	[x86] add 16 missed hadd patterns (PR39195); NFC llvm-svn: 343965	2018-10-08 12:54:33 +00:00
David Carlier	b07407e6af	[Sanitizer] fix internal_sysctlbyname build for FreeBSD. llvm-svn: 343964	2018-10-08 12:18:19 +00:00
Haojian Wu	162510f619	[clangd] Update the out-of-date yaml-symbol-file flag in clangd. Summary: The flag is stale due to the recent changes of clangd indexer, this patch renames the flag to "index-file". Reviewers: sammccall Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits Differential Revision: https://reviews.llvm.org/D52976 llvm-svn: 343963	2018-10-08 10:44:54 +00:00
Neil Henning	57f5d0a885	[IRBuilder] Fixup CreateIntrinsic to allow specifying Types to Mangle. The IRBuilder CreateIntrinsic method wouldn't allow you to specify the types that you wanted the intrinsic to be mangled with. To fix this I've: - Added an ArrayRef<Type *> member to both CreateIntrinsic overloads. - Used that array to pass into the Intrinsic::getDeclaration call. - Added a CreateUnaryIntrinsic to replace the most common use of CreateIntrinsic where the type was auto-deduced from operand 0. - Added a bunch more unit tests to test CreateIntrinsic calls that weren't being tested (including the FMF flag that wasn't checked). This was suggested as part of the AMDGPU specific atomic optimizer review (https://reviews.llvm.org/D51969). Differential Revision: https://reviews.llvm.org/D52087 llvm-svn: 343962	2018-10-08 10:32:33 +00:00
Francis Visoiu Mistrih	627d146998	[AsmParser] Return an error in the case of empty symbol ref in an expression The following instruction: > str q28, [x0, #164*@] contains a @ which is parsed as an empty symbol. The parser returns true but has no error, so the assembler continues by ignoring the instruction. Differential Revision: https://reviews.llvm.org/D52645 llvm-svn: 343961	2018-10-08 10:28:11 +00:00
Peter Smith	6f36cd4d76	[ARM] Account for implicit IT when calculating inline asm size When deciding if it is safe to optimize a conditional branch to a CBZ or CBNZ the offsets of the BasicBlocks from the start of the function are estimated. For inline assembly the generic getInlineAsmLength() function is used to get a worst case estimate of the inline assembly by multiplying the number of instructions by the max instruction size of 4 bytes. This unfortunately doesn't take into account the generation of Thumb implicit IT instructions. In edge cases such as when all the instructions in the block are 4-bytes in size and there is an implicit IT then the size is underestimated. This can cause an out of range CBZ or CBNZ to be generated. The patch takes a conservative approach and assumes that every instruction in the inline assembly block may have an implicit IT. Fixes pr31805 Differential Revision: https://reviews.llvm.org/D52834 llvm-svn: 343960	2018-10-08 09:38:28 +00:00
Oliver Stannard	9ecdac8ee0	[AArch64] Fix verifier error when outlining indirect calls The MachineOutliner for AArch64 transforms indirect calls into indirect tail calls, replacing the call with the TCRETURNri pseudo-instruction. This pseudo lowers to a BR, but has the isCall and isReturn flags set. The problem is that TCRETURNri takes a tcGPR64 as the register argument, to prevent indirect tail-calls from using caller-saved registers. The indirect calls transformed by the outliner could use caller-saved registers. This is fine, because the outliner ensures that the register is available at all call sites. However, this causes a verifier failure when the register is not in tcGPR64. The fix is to add a new pseudo-instruction like TCRETURNri, but which accepts any GPR. Differential revision: https://reviews.llvm.org/D52829 llvm-svn: 343959	2018-10-08 09:18:48 +00:00
Alex Bradbury	5af6c1496a	[RISCV] Update alu8.ll and alu16.ll test cases The srli test in alu8.ll was a no-op, as it shifted by 8 bits. Fix this, and also change the immediate in alu16.ll, as shifting by something other than a power of 8 is more interesting. llvm-svn: 343958	2018-10-08 09:08:51 +00:00
Kristina Brooks	bcc86a95c1	[DebugInfo][PDB] Fix a signed/unsigned conversion warning Fix the following warning when compiling with clang (caused by commit rL343951): GlobalsStream.cpp:61:33: warning: comparison of integers of different signs: 'int' and 'uint32_t' This also avoids double evaluation of `GlobalsTable.HashBuckets.size()`. llvm-svn: 343957	2018-10-08 09:03:17 +00:00
Ewan Crawford	fa120cbdbc	[InstCombine] Fix incongruous GEP type addrspace Currently running the @insertelem_after_gep function below through the InstCombine pass with opt produces invalid IR. Input: ``` define void @insertelem_after_gep(<16 x i32>* %t0) { %t1 = bitcast <16 x i32>* %t0 to [16 x i32]* %t2 = addrspacecast [16 x i32]* %t1 to [16 x i32] addrspace(3)* %t3 = getelementptr inbounds [16 x i32], [16 x i32] addrspace(3)* %t2, i64 0, i64 0 %t4 = insertelement <16 x i32 addrspace(3)*> undef, i32 addrspace(3)* %t3, i32 0 call void @extern_vec_pointers_func(<16 x i32 addrspace(3)*> %t4) ret void } ``` Output: ``` define void @insertelem_after_gep(<16 x i32>* %t0) { %t3 = getelementptr inbounds <16 x i32>, <16 x i32>* %t0, i64 0, i64 0 %t4 = insertelement <16 x i32 addrspace(3)*> undef, i32 addrspace(3)* %t3, i32 0 call void @my_extern_func(<16 x i32 addrspace(3)*> %t4) ret void } ``` Which although causes no complaints when produced, isn't valid IR as the insertelement use of the %t3 GEP expects an address space. ``` opt: /tmp/bad.ll:52:73: error: '%t3' defined with type 'i32*' but expected 'i32 addrspace(3)*' %t4 = insertelement <16 x i32 addrspace(3)*> undef, i32 addrspace(3)* %t3, i32 0 ``` I've fixed this by adding an addrspacecast after the GEP in the InstCombine pass, and including a check for this type mismatch to the verifier. Reviewers: spatel, lebedev.ri Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52294 llvm-svn: 343956	2018-10-08 08:40:45 +00:00
Alex Bradbury	f27c67af12	[SelectionDAGBuilder][NFC] Pass LHSTy to getShiftAmountTy rather than RHSTy r126518 introduced a type parameter to the getShiftAmountTy target hook. It produces the type of the shift (RHSTy), parameterised by the type of the value being shifted (LHSTy). SelectionDAGBuilder::visitShift passed RHSTy rather than LHSTy and this patch corrects this. The change is a no-op because in LLVM IR the LHS and RHS types for a shift must be equal anyway. llvm-svn: 343955	2018-10-08 06:24:59 +00:00
Max Kazantsev	b07369651e	[LV] Do not create SCEVs on broken IR in emitTransformedIndex. PR39160 At the point when we perform `emitTransformedIndex`, we have a broken IR (in particular, we have Phis for which not every incoming value is properly set). On such IR, it is illegal to create SCEV expressions, because their internal simplification process may try to prove some predicates and break when it stumbles across some broken IR. The only purpose of using SCEV in this particular place is an attempt to simplify the generated code slightly. It seems that the result isn't worth it, because some trivial cases (like addition of zero and multiplication by 1) can be handled separately if needed, but more generally InstCombine is able to achieve the goals we want to achieve by using SCEV. This patch fixes a functional crash described in PR39160, and as a side effect it also generates slightly smarter code in some simple cases. It also may cause some optimality loss (i.e. we will now generate `mul` by power of `2` instead of shift etc), but there is nothing that InstCombine could not handle later. In case of dire need, we can support more trivial cases just in place. Note that this patch only fixes one particular case of the general problem that LV misuses SCEV, attempting to create SCEVs or prove predicates on invalid IR. The general solution, however, seems complex enough. Differential Revision: https://reviews.llvm.org/D52881 Reviewed By: fhahn, hsaito llvm-svn: 343954	2018-10-08 05:46:29 +00:00
Zachary Turner	ba73a91491	Fix a -Wsign-compare warning. llvm-svn: 343953	2018-10-08 04:44:12 +00:00
Zachary Turner	9f6ac4c264	Fix a compilation failure on non-MSVC compilers. llvm-svn: 343952	2018-10-08 04:34:41 +00:00
Zachary Turner	94926a6db8	[PDB] Add the ability to lookup global symbols by name. The Globals table is a hash table keyed on symbol name, so it's possible to lookup symbols by name in O(1) time. Add a function to the globals stream to do this, and add an option to llvm-pdbutil to exercise this, then use it to write some tests to verify correctness. llvm-svn: 343951	2018-10-08 04:19:16 +00:00
Craig Topper	98dd9d6896	Revert r343948 "[LegalizeDAG] Make one of the ReplaceNode signatures take an ArrayRef instead a pointer to an array. Add assert on size of array. NFC" The assert is failing some asan tests on the bots. llvm-svn: 343950	2018-10-08 03:12:12 +00:00
Brian Gesiak	0b56830011	[coro]Pass rvalue reference for named local variable to return_value Summary: Addressing https://bugs.llvm.org/show_bug.cgi?id=37265. Implements [class.copy]/33 of coroutines TS. When the criteria for elision of a copy/move operation are met, but not for an exception-declaration, and the object to be copied is designated by an lvalue, or when the expression in a return or co_return statement is a (possibly parenthesized) id-expression that names an object with automatic storage duration declared in the body or parameter-declaration-clause of the innermost enclosing function or lambda-expression, overload resolution to select the constructor for the copy or the return_value overload to call is first performed as if the object were designated by an rvalue. Patch by Tanoy Sinha! Reviewers: modocache, GorNishanov Reviewed By: modocache, GorNishanov Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D51741 llvm-svn: 343949	2018-10-08 03:08:39 +00:00
Craig Topper	c058a68784	[LegalizeDAG] Make one of the ReplaceNode signatures take an ArrayRef instead a pointer to an array. Add assert on size of array. NFC llvm-svn: 343948	2018-10-08 02:02:08 +00:00
Craig Topper	cd38de8b15	[LegalizeDAG] Move legalization of scatter and masked store from LegalizeVectorOps to LegalizeDAG. This is where we legalize gather and masked load so this is consistent. Since these ops are always on vectors I've chosen to go with LegalizeDAG since that's what we do for other vector only ops like BUILD_VECTOR, VECTOR_SHUFFLE, etc. The ScalarizeMaskedMemIntrinsic pass should take care of scalarizing these before SelectionDAG so hopefully we don't need to worry about illegally typed scalar ops being emitted in the legalizing. If we did we would need to do this in LegalizeVectorOps so we could get the second type legalization that runs between LegalizeVectorOps and LegalizeDAG. llvm-svn: 343947	2018-10-08 00:04:55 +00:00
Fangrui Song	8380c9e918	[clangd] Migrate to LLVM STLExtras range API llvm-svn: 343946	2018-10-07 17:21:08 +00:00
Sanjay Patel	ecc8af61e7	[DAGCombiner] allow undef elts in vector fadd matching llvm-svn: 343945	2018-10-07 16:30:42 +00:00
Sanjay Patel	f956840dbe	[x86] add vector fadd with undef elts test; NFC llvm-svn: 343944	2018-10-07 16:27:50 +00:00
Sanjay Patel	6c02c6a3a6	[x86] remove redundant tests; NFC The equivalent tests were added to the file with related folds in rL343941. llvm-svn: 343943	2018-10-07 16:13:38 +00:00
Sanjay Patel	ef76e27985	[DAGCombiner] allow undefs when matching vector splats for fmul folds llvm-svn: 343942	2018-10-07 16:05:37 +00:00
Sanjay Patel	fcb1061c13	[x86] add vector fmul with undef elts tests; NFC llvm-svn: 343941	2018-10-07 16:00:55 +00:00
Sanjay Patel	0b74c840dd	[DAGCombiner] allow undef elts in vector fabs/fneg matching This change is proposed as a part of D44548, but we need this independently to avoid regressions from improved undef propagation in SimplifyDemandedVectorElts(). llvm-svn: 343940	2018-10-07 15:32:06 +00:00
Sanjay Patel	46a9dc2e3e	[DAGCombiner] shorten code for bitcast+fabs fold; NFC llvm-svn: 343939	2018-10-07 15:18:30 +00:00
Sanjay Patel	31a3f2aaba	[x86] add tests for FP logic folding for vectors with undefs; NFC llvm-svn: 343938	2018-10-07 15:05:39 +00:00
Kirill Bobyrev	4a5ff88fdb	[clangd] NFC: Migrate to LLVM STLExtras API where possible This patch improves readability by migrating `std::function(ForwardIt start, ForwardIt end, ...)` to LLVM's STLExtras range-based equivalent `llvm::function(RangeT &&Range, ...)`. Similar change in Clang: D52576. Reviewed By: sammccall Differential Revision: https://reviews.llvm.org/D52650 llvm-svn: 343937	2018-10-07 14:49:41 +00:00
Sanjay Patel	01daf62a0d	[InstSimplify] add vector test for fneg+fdiv; NFC This should be fixed with D52934. llvm-svn: 343936	2018-10-07 14:46:33 +00:00
Simon Pilgrim	3b04a4e322	[SelectionDAG] Respect multiple uses in SimplifyDemandedBits to SimplifyDemandedVectorElts simplification rL343913 was using SimplifyDemandedBits's original demanded mask instead of the adjusted 'NewMask' that accounts for multiple uses of the op (those variable names really need improving....). Annoyingly many of the test changes (back to pre-rL343913 state) are actually safe - but only because their multiple uses are all by PMULDQ/PMULUDQ. Thanks to Jan Vesely (@jvesely) for bisecting the bug. llvm-svn: 343935	2018-10-07 11:45:46 +00:00
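
The lazy uncompression described in e28c146423 can be sketched roughly as follows. This is a Python model of the control flow only, not lld's code: `LazySection`, `write_to`, and the `cached_copies` counter are hypothetical names chosen for illustration, and the section is assumed to be zlib-compressed.

```python
import zlib


class LazySection:
    """Hypothetical stand-in for an input section; not lld's actual class."""

    def __init__(self, raw: bytes, compressed: bool, size: int):
        self._raw = raw              # bytes exactly as stored in the input file
        self._compressed = compressed
        self.size = size             # uncompressed size, i.e. space needed in the output
        self._expanded = None        # cached copy, created only if data() is called
        self.cached_copies = 0       # illustration only: counts cached expansions

    def data(self) -> bytes:
        """Return the uncompressed contents, expanding them on first use."""
        if not self._compressed:
            return self._raw
        if self._expanded is None:
            self._expanded = zlib.decompress(self._raw)
            self.cached_copies += 1
        return self._expanded

    def write_to(self, out: bytearray, offset: int) -> None:
        """Place the section's contents at out[offset : offset + size]."""
        if self._compressed and self._expanded is None:
            # Nobody called data(): inflate for the output without caching a copy.
            # (Python's zlib still returns a temporary bytes object; a C++ linker
            # could inflate straight into the destination.)
            out[offset:offset + self.size] = zlib.decompress(self._raw)
        else:
            out[offset:offset + self.size] = self.data()


payload = b"pubnames" * 64
section = LazySection(zlib.compress(payload), compressed=True, size=len(payload))
output = bytearray(16 + len(payload))
section.write_to(output, 16)   # no per-section cached copy is created here
```

The point of hiding the raw member behind an accessor is that callers cannot tell whether expansion has happened yet, so sections that are only copied to the output never pay for a long-lived uncompressed buffer.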

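
The steps listed in 6641657453 (the AMDGPU atomic optimizer) can be modelled on the host as below. This is a simplified sketch, not the pass itself: a Python list stands in for the lanes of one wavefront, `wavefront_atomic_add` is a hypothetical name, and the wavefront-wide scan that the commit performs with DPP is replaced by an ordinary prefix sum.

```python
from itertools import accumulate


def wavefront_atomic_add(memory: list, address: int, lane_values: list) -> list:
    """Return the value each lane would have read back from its own atomic add,
    assuming the lanes had executed their atomics one after another in lane order."""
    # Exclusive prefix sum: lane i's offset is the sum of the values in lanes 0..i-1.
    offsets = list(accumulate(lane_values, initial=0))[:-1]
    # Wavefront-wide reduction: the total amount to add.
    total = sum(lane_values)
    # Only a single lane issues the atomic, using the combined total.
    old = memory[address]
    memory[address] = old + total
    # Broadcast the old value and add each lane's offset into the final result.
    return [old + off for off in offsets]


memory = [10]
per_lane_old = wavefront_atomic_add(memory, 0, [1, 2, 3, 4])
# per_lane_old == [10, 11, 13, 16] and memory[0] == 20
```

In the uniform case every lane contributes the same value, so the total reduces to the value times the number of active lanes and each lane's offset to the value times its lane index. The list-based version above covers both cases at the cost of always computing the scan.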

300645 Commits
