llvm-project

Commit Graph

Author	SHA1	Message	Date
Eli Friedman	7e88efa7c5	[LegalizeTypes][SVE] Support widen/split legalization for SPLAT_VECTOR Just the obvious implementation that rewrites the result type. Also fix warning from EXTRACT_SUBVECTOR legalization that triggers on the test. Differential Revision: https://reviews.llvm.org/D84706	2020-07-30 16:17:45 -07:00
Amara Emerson	09f9f7dd1b	[AArch64][GlobalISel] Add legalization & selection support for G_INTRINSIC_LRINT. Differential Revision: https://reviews.llvm.org/D84552	2020-07-30 16:14:56 -07:00
Mircea Trofin	abb8128237	[doc] Describe the header guard style clang-tidy's llvm-header-guard rule references the LLVM style - where it's missing. Differential Revision: https://reviews.llvm.org/D84989	2020-07-30 16:08:07 -07:00
LLVM GN Syncbot	b811736f8b	[gn build] Port `763671f387`	2020-07-30 22:29:22 +00:00
Lang Hames	8ce8cee1e1	[llvm-jitlink] Add -harness option to llvm-jitlink. The -harness option enables new testing use-cases for llvm-jitlink. It takes a list of objects to treat as a test harness for any regular objects passed to llvm-jitlink. If any files are passed using the -harness option then the following transformations are applied to all other files: (1) Symbols definitions that are referenced by the harness files are promoted to default scope. (This enables access to statics from test harness). (2) Symbols definitions that clash with definitions in the harness files are deleted. (This enables interposition by test harness). (3) All other definitions in regular files are demoted to local scope. (This causes untested code to be dead stripped, reducing memory cost and eliminating spurious unresolved symbol errors from untested code). These transformations allow the harness files to reference and interpose symbols in the regular object files, which can be used to support execution tests (including fuzz tests) of functions in relocatable objects produced by a build.	2020-07-30 15:26:19 -07:00
Lang Hames	9f1dcdca71	[JITLink] Allow JITLinkContext::notifyResolved to return an Error. This allows clients to detect invalid transformations applied by JITLink passes (e.g. inserting or removing symbols in unexpected ways) and terminate linking with an error. This change is used to simplify the error propagation logic in ObjectLinkingLayer.	2020-07-30 15:26:18 -07:00
Matt Arsenault	e56e9022bc	AMDGPU: Fix liveness errors when copying AGPR tuples Avoid recursively calling copyPhysReg for AGPR handling. This was dropping the necessary super register implicit defs to avoid liveness verifier errors.	2020-07-30 18:13:04 -04:00
Changpeng Fang	243376cdc7	AMDGPU: Put inexpensive ops first in AMDGPUAnnotateUniformValues::visitLoadInst Summary: This is in response to the review of https://reviews.llvm.org/D84873: The expensive check should be reordered last Reviewers: arsenm Differential Revision: https://reviews.llvm.org/D84890	2020-07-30 14:37:06 -07:00
Nikita Popov	9ebeac6788	[ConstantRange][CVP] Make use of abs poison flag Pass the abs poison flag to the underlying ConstantRange implementation, allowing CVP to simplify based on it. Importantly, this recognizes that abs with poison flag is actually non-negative...	2020-07-30 23:06:10 +02:00
Jon Roelofs	afae6d97fa	[SelectionDAG] Fix lowering of vector geps This fixes an assertion failure that was being triggered in SelectionDAG::getZeroExtendInReg(), where it was trying to extend the <2xi32> to i64 (which should have been <2xi64>). Fixes: rdar://66016901 Differential Revision: https://reviews.llvm.org/D84884	2020-07-30 14:56:53 -06:00
Nikita Popov	94f8120cb9	[ConstantRange] Support abs with poison flag This just adds the ConstantRange support, including exhaustive testing. It's not wired up to the IR intrinsic flag yet.	2020-07-30 22:49:28 +02:00
Nikita Popov	d8a98a9c35	[ConstantRange][CVP] Compute min/max/abs intrinsic ranges Wire up ConstantRange::intrinsic() to the existing primitives for min, max and abs. The poison flag on abs is not yet taken into account.	2020-07-30 22:21:34 +02:00
Nikita Popov	95d1e668ed	[CVP] Add tests for min/max/abs intrinsic comparisons (NFC)	2020-07-30 22:17:03 +02:00
Nikita Popov	4c16eafe12	[SCCP] Remove dead switch cases based on range information Determine whether switch edges are feasible based on range information, and remove non-feasible edges later on. This does not try to determine whether the default edge is dead, as we'd have to determine that the range is fully covered by the cases for that. Another limitation here is that we don't remove dead cases that have the same successor as a live case. I'm not handling this because I wanted to keep the edge removal based on feasible edges only, rather than inspecting ranges again there -- this does not seem like a particularly useful case to handle. Differential Revision: https://reviews.llvm.org/D84270	2020-07-30 21:21:08 +02:00
Florian Hahn	2062b3707c	[LAA] Avoid adding pointers to the checks if they are not needed. Currently we skip alias sets with only reads or a single write and no reads, but still add the pointers to the list of pointers in RtCheck. This can lead to cases where we try to access a pointer that does not exist when grouping checks. In most cases, the way we access PositionMap masked that, as the value would default to index 0. But in the example in PR46854 it causes a crash. This patch updates the logic to avoid adding pointers for alias sets that do not need any checks. It makes things slightly more verbose, by first checking the numbers of reads/writes and bailing out early if we don't need checks for the alias set. I think this makes the logic a bit simpler to follow. Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D84608	2020-07-30 19:21:14 +01:00
Sanjay Patel	7551fd5ef8	[InstCombine] update test checks; NFC	2020-07-30 14:16:48 -04:00
Ettore Tiotto	36a4f10376	Fix computeHostNumPhysicalCores() for Linux on POWER and Linux on Z ThinLTO is run using a single thread on Linux on Power. The compute_thread_count() routine calls getHostNumPhysicalCores which returns -1 by default, and so `MaxThreadCount` is set to 1. unsigned llvm::ThreadPoolStrategy::compute_thread_count() const { int MaxThreadCount = UseHyperThreads ? computeHostNumHardwareThreads() : sys::getHostNumPhysicalCores(); if (MaxThreadCount <= 0) MaxThreadCount = 1; … } Fix: provide custom implementation of getHostNumPhysicalCores for Linux on Power and Linux on Z. Reviewed By: Kai, uweigand Differential Revision: https://reviews.llvm.org/D84764	2020-07-30 18:05:36 +00:00
Wouter van Oortmerssen	ce1eb7af9d	[WebAssembly] Fixed 64-bit indices in br_table LLVM selection dag assumes "switch" indices are pointer sized, which causes problems for our 32-bit br_table. The new function ensures 32-bit operands don't get unnecessarily extended, and 64-bit operands get truncated. Note that the changes to the existing test test exactly that: the addition of -NEXT in 2 places ensures no extension is inserted (which the test previously ignored) and that the wrap is present (previously omitted in wasm64 mode). Differential Revision: https://reviews.llvm.org/D84705	2020-07-30 10:52:16 -07:00
Stanislav Mekhanoshin	5b32518f96	[AMDGPU] Do not use undef on indirect source We are using undef on the indirect move source subreg and then using implicit super-reg. This creates a problem in RA when Greedy decides to split the register. It reassigns the implicit super-reg but does not bother to change undef source because it really does not matter. The fix is to stop lying to RA and drop undef flag. This has also hit a problem in SIFoldOperands as it can fold immediate into an indirect move since there is no undef flag anymore. That results in multiple test failures, so added the check for this case. Differential Revision: https://reviews.llvm.org/D84899	2020-07-30 10:41:59 -07:00
Simon Pilgrim	4a161bd8b3	LoopUnroll.cpp - pass std::vector by const reference to needToInsertPhisForLCSSA helper. NFCI. Avoid an unnecessary pass by value.	2020-07-30 18:17:04 +01:00
Yuanfang Chen	555cf42f38	[NewPM][PassInstrument] Add PrintPass callback to StandardInstrumentations Problem: Right now, our "Running pass" is not accurate when passes are wrapped in an adaptor because the adaptor is never skipped and a pass could be skipped. The other problem is that "Running pass" for an adaptor is before any "Running pass" of passes/analyses it depends on (for example, FunctionToLoopPassAdaptor). So the order of printing is not the actual order. Solution: Doing things like PassManager::Debuglogging is very intrusive because we need to specify Debuglogging whenever an adaptor is created. (Actually, right now we're not specifying Debuglogging for some sub-PassManagers. Check PassBuilder) This patch moves debug logging for passes into a PassInstrument callback. We could be sure that all running passes are logged and in the correct order. This could also be used to implement hierarchy pass logging in legacy PM. We could also move logging of pass manager to this if we want. The test fixes look messy. They include these changes: - Remove PassInstrumentationAnalysis - Remove PassAdaptor - If a PassAdaptor is for a real pass, the pass is added - Pass reorder (to the correct order), related to PassAdaptor - Add missing passes (due to Debuglogging not passed down) Reviewed By: asbirlea, aeubanks Differential Revision: https://reviews.llvm.org/D84774	2020-07-30 10:07:57 -07:00
Craig Topper	3632f765dc	[WebAssembly] Fix GCC 5 build. Hans' speculative fix in `b7292f2db0` didn't work for me. This seems to.	2020-07-30 10:00:28 -07:00
Hiroshi Yamauchi	3d6f53018f	[PGO] Include the mem ops into the function hash. To avoid hash collisions when the only difference is in mem ops.	2020-07-30 09:26:20 -07:00
hsmahesha	33fd4a18e7	[AMDGPU/MemOpsCluster] Clean-up fixme's around mem ops clustering logic Get rid of all fixmes and base heuristic on `num-clustered-dwords`. The main intuition behind this is as follows. The existing heuristic roughly summarizes as below: * Assume, all the mem ops instructions participating in the clustering process, loads/stores same num bytes * If num bytes loaded by each mem op is 4 bytes, then cluster at max 5 mem ops, that is at max 20 bytes * If num bytes loaded by each mem op is 8 bytes, then cluster at max 3 mem ops, that is at max 24 bytes * If num bytes loaded by each mem op is 16 bytes, then cluster at max 2 mem ops, that is at max 32 bytes So, we need to make sure that the new heuristic does not completely deviate away from the above one, and it properly handles both the sub-word loads and the wide loads. Reviewed By: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D84354	2020-07-30 21:41:13 +05:30
Brendon Cahoon	7b114446c3	Align store conditional address In cases where the alignment of the datatype is smaller than expected by the instruction, the address is aligned. The aligned address is used for the load, but wasn't used for the store conditional, which resulted in a run-time alignment exception.	2020-07-30 10:42:00 -05:00
Fangrui Song	d2c2248722	[X86] Parse and ignore .arch directives We parse .arch so that some `.arch i386; .code32` code can assemble. It seems that X86AsmParser does not do a good job tracking what features are needed to assemble instructions. GNU as's x86 port supports a very wide range of .arch operands. Ignore the operand for now. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D84900	2020-07-30 08:30:06 -07:00
Johannes Doerfert	19756ef53a	[OpenMP][IRBuilder] Support allocas in nested parallel regions We need to keep track of the alloca insertion point (which we already communicate via the callback to the user) as we place allocas as well. Reviewed By: fghanim, SouraVX Differential Revision: https://reviews.llvm.org/D82470	2020-07-30 10:19:39 -05:00
Momchil Velikov	ef4e665435	[AArch64] Fix operand definitions of XPACI/XPACD The operand to these instructions is both input and output. These are not yet emitted by the compiler and the assembler already works fine, so can't test in this patch. But D75044 will use XPACI and provide test coverage for this patch as well. Differential Revision: https://reviews.llvm.org/D84298	2020-07-30 15:31:44 +01:00
Matt Arsenault	b8c8d1b309	AMDGPU: Convert some tests to use new buffer intrinsics The legacy not struct or raw buffer intrinsics should now all be consolidated into the tests specifically for those intrinsics.	2020-07-30 10:30:43 -04:00
Simon Pilgrim	6316b0023e	Attributor.h - remove unnecessary includes. NFCI. Fix implicit cpp include dependencies.	2020-07-30 15:26:41 +01:00
Jinsong Ji	dab8d6104b	[PowerPC][AIX] Move the testcase to proper dir	2020-07-30 14:25:59 +00:00
Hans Wennborg	b7292f2db0	Speculative GCC 5 build fix It's complaining about specializing the template in a different namespace.	2020-07-30 16:12:52 +02:00
jasonliu	04dc9691eb	[XCOFF][AIX] Enable -ffunction-sections Summary: This patch implements -ffunction-sections on AIX. This patch focuses on assembly generation. Follow-on patch needs to handle: 1. -ffunction-sections implication for jump table. 2. Object file generation path and associated testing. Differential Revision: https://reviews.llvm.org/D83875	2020-07-30 13:30:01 +00:00
Sanjay Patel	f7237ee74f	[ConstantFolding] add tests for abs intrinsic; NFC	2020-07-30 09:28:30 -04:00
David Green	1da0c47fa2	[LoopVectorizer] Don't create unused block masks for reductions. NFC This removes some unneeded block masks when we don't have any reductions. It should not have any effect on codegen as the values created are dead anyway. Differential Revision: https://reviews.llvm.org/D81415	2020-07-30 14:28:08 +01:00
Florian Hahn	59d6e814ce	Revert "[IPConstProp] Remove and move tests to SCCP." This reverts commit `e77624a3be`. Looks like some clang tests manually invoke -ipconstprop via opt.....	2020-07-30 13:06:54 +01:00
Florian Hahn	e77624a3be	[IPConstProp] Remove and move tests to SCCP. As far as I know, ipconstprop has not been used in years and ipsccp has been used instead. This has the potential for confusion and sometimes leads people to spend time finding & reporting bugs as well as updating it to work with the latest API changes. This patch moves the tests over to SCCP. There's one functional difference I am aware of: ipconstprop propagates for each call-site individually, so for functions that are called with different constant arguments it can sometimes produce better results than ipsccp (at much higher compile-time cost). But IPSCCP can be thought to do so as well for internal functions and as mentioned earlier, the pass seems unused in practice (and there are no plans on working towards enabling it anytime). Also discussed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2020-July/143773.html Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D84447	2020-07-30 12:36:27 +01:00
Simon Pilgrim	cc529285fd	VectorUtils.h - reduce unnecessary includes. NFC. Replace TargetLibraryInfo.h include with forward declaration and fix implicit dependencies. Reduce SmallSet.h include to SmallVector.h include.	2020-07-30 12:27:49 +01:00
Simon Pilgrim	2dec72ba5c	[X86][SSE] combineExtractWithShuffle - extend extract(truncate(x),0) for any source vector size As long as we can extract the lowest 128-bit subvector from the pre-truncated source vector, then we don't care what size it is. The next stage will be to support non-zero extraction indices, as long as it's still coming from the lowest 128-bit subvector.	2020-07-30 12:27:49 +01:00
Florian Hahn	44a4ba859d	[AArch64] Add machine-combiner tests with instruction level FMFs.	2020-07-30 11:41:09 +01:00
Esme-Yi	141b64a340	[NFC] Failed cases for some patterns defined in DAGCombiner.cpp	2020-07-30 10:05:04 +00:00
Xing GUO	3da6a974db	[DWARFYAML] Make the 'Length' field of the address range table optional. This patch makes the 'Length' field of the address range table optional. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D84911	2020-07-30 17:42:18 +08:00
Xing GUO	006f6f8ac6	[DWARFYAML] Make the 'AddressSize', 'SegmentSelectorSize' fields optional. This patch makes the 'AddressSize' and 'SegmentSelectorSize' fields of address range table optional. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D84907	2020-07-30 17:39:58 +08:00
Sam Tebbs	276ed5f7e4	[DAGCombiner] Fold sext_inreg of a masked load into a sign extended masked load This patch adds a DAG combine fold for a sext(masked_load) into a sign extended masked load. Differential Revision: https://reviews.llvm.org/D84332	2020-07-30 10:34:02 +01:00
Florian Hahn	1ac72a0774	[IPConstProp] Regenerate check lines. Preparation for D84447.	2020-07-30 09:52:16 +01:00
Kang Zhang	0037a5f894	[PHIElimination] Fix the killed flag for LowerPHINode() Summary: In the phi-node-elimination pass, we set the killed flag incorrectly. When we eliminate the PHI node, we replace the PHI with a copy for the incoming value. Before this patch, we will set incoming value as killed(PHICopy). And we will remove the killed flag from last using incoming value(OldKill). This is correct, only if the new PHICopy is after the OldKill. Reviewed By: bjope Differential Revision: https://reviews.llvm.org/D80886	2020-07-30 08:18:50 +00:00
David Sherwood	23ad660b5d	[SVE][CodeGen] At -O0 fallback to DAG ISel when translating alloca with scalable types When building code at -O0 we weren't falling back to DAG ISel correctly when encountering alloca instructions with scalable vector types. This is because the alloca has no operands that are scalable. I've fixed this by adding a check in AArch64ISelLowering::fallBackToDAGISel for alloca instructions with scalable types. Differential Revision: https://reviews.llvm.org/D84746	2020-07-30 08:40:53 +01:00
Craig Topper	07bb8240a0	[X86] Pass the OperandVector to ParseMemOperand instead of returning the operand. NFCI Continue the change made to ParseATTOperand to take the vector by reference. Let ParseMemOperand add its memory operand to the vector and just return true/false to indicate error.	2020-07-29 23:44:56 -07:00
Craig Topper	17597442db	[X86] Don't pass so many parameters to ParseMemOperand by reference. Pointers and SMLocs are cheap to copy. Even though the function modifies some of these the caller doesn't use them after the call.	2020-07-29 23:44:56 -07:00
Serge Pavlov	032ed39def	[Support] Class to facilitate file locking This change defines the RAII class `FileLocker` and the methods `lock` and `tryLockFor` of the class `raw_fd_stream` to facilitate using file locks. Differential Revision: https://reviews.llvm.org/D79066	2020-07-30 13:42:20 +07:00


201084 Commits