llvm-project

Commit Graph

Author	SHA1	Message	Date
Arthur Eubanks	c2590de30d	[docs][NewPM] Add docs for writing NPM passes So as not to conflict with the legacy PM example passes under llvm/lib/Transforms/Hello, this is under HelloNew. This makes the CMakeLists.txt and general directory structure less confusing for people following the example. Much of the doc structure was taken from WritingAnLLVMPass.rst. This adds a HelloWorld pass which simply prints out each function name. More will follow after this, e.g. passes over different units of IR, analyses. https://llvm.org/docs/WritingAnLLVMPass.html contains a lot more. Reviewed By: ychen, asbirlea Differential Revision: https://reviews.llvm.org/D86979	2020-09-14 13:26:03 -07:00
Teresa Johnson	226d80ebe2	[MemProf] Rename HeapProfiler to MemProfiler for consistency This is consistent with the clang option added in `7ed8124d46`, and the comments on the runtime patch in D87120. Differential Revision: https://reviews.llvm.org/D87622	2020-09-14 13:14:57 -07:00
Craig Topper	4208ea3e19	[FastISel] Bail out of selectGetElementPtr for vector GEPs. The code that decomposes the GEP into ADD/MUL doesn't work properly for vector GEPs. It can create bad COPY instructions or possibly assert. For now just bail out to SelectionDAG. Fixes PR45906	2020-09-14 12:53:06 -07:00
Kamau Bridgeman	c0f199e566	[PowerPC] Implement Thread Local Storage Support for Local Exec This patch is the initial support for the Local Exec Thread Local Storage model to produce code sequence and relocations correct to the ABI for the model when using PC relative memory operations. Patch by: Kamau Bridgeman Differential Revision: https://reviews.llvm.org/D83404	2020-09-14 14:16:28 -05:00
Nikita Popov	53f36f06af	[Legalize][ARM][X86] Add float legalization for VECREDUCE This adds SoftenFloatRes, PromoteFloatRes and SoftPromoteHalfRes legalizations for VECREDUCE, to fill the remaining hole in the SDAG legalization. These legalizations simply expand the reduction and let it be recursively legalized. For the PromoteFloatRes case at least it is possible to do better than that, but it's pretty tricky (because we need to consider the interaction of three different vector legalizations and the type promotion) and probably not really worthwhile. I haven't added ExpandFloatRes support, as I am not familiar with ppc_fp128. Differential Revision: https://reviews.llvm.org/D87569	2020-09-14 20:42:09 +02:00
Eric Astor	23a2b03221	[ms] [llvm-ml] Add basic support for SEH, including PROC FRAME Add basic support for SEH, including PROC FRAME Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D86948	2020-09-14 14:32:55 -04:00
Eric Astor	20201dc76a	[ms] [llvm-ml] Add support for size queries in MASM Add support for size inference, sizeof, typeof, and lengthof. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D86947	2020-09-14 14:27:06 -04:00
Eric Astor	7c44ee8e19	[ms] [llvm-ml] Fix struct padding logic MASM structs are end-padded to have size a multiple of the smaller of the requested alignment and the size of their largest field (taken recursively, if they have a field of STRUCT type). This matches the behavior of ml.exe and ml64.exe. Our original implementation followed the MASM 6.0 documentation, which instead specified that MASM structs were padded to a multiple of their requested alignment. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D87248	2020-09-14 14:12:20 -04:00
Eric Astor	da17e0d5c1	[ms] [llvm-ml] Add missing built-in type aliases Add signed aliases for integral types, as well as the "DF" abbreviation for the FWORD type. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D87246	2020-09-14 14:09:24 -04:00
Nikita Popov	cfff88c03c	[InstCombine] Simplify select operand based on equality condition For selects of the type X == Y ? A : B, check if we can simplify A by using the X == Y equality and replace the operand if that's possible. We already try to do this in InstSimplify, but will only fold if the result of the simplification is the same as B, in which case the select can be dropped entirely. Here the select will be retained, just one operand simplified. As we are performing an actual replacement here, we don't have problems with refinement / poison values. Differential Revision: https://reviews.llvm.org/D87480	2020-09-14 20:07:06 +02:00
Nikita Popov	8e69c3cde8	[DAGCombiner] Fold fmin/fmax with INF / FLT_MAX Similar to D87415, this folds the various float min/max opcodes with a constant INF or -INF operand, or FLT_MAX / -FLT_MAX operand if the ninf flag is set. Some of the folds are only possible under nnan. The fminnum(X, INF) with nnan and fmaxnum(X, -INF) with nnan cases are needed to improve the VECREDUCE_FMIN/FMAX lowerings on X86, the rest is here for the sake of completeness. Differential Revision: https://reviews.llvm.org/D87571	2020-09-14 19:59:33 +02:00
Simon Pilgrim	4ff4708d39	collectBitParts - use const references. NFCI. Fixes clang-tidy warnings first noticed on D87452.	2020-09-14 18:23:00 +01:00
Rahman Lavaee	7841e21c98	Let -basic-block-sections=labels emit basicblock metadata in a new .bb_addr_map section, instead of emitting special unary-encoded symbols. This patch introduces the new .bb_addr_map section feature which allows us to emit the bits needed for mapping binary profiles to basic blocks into a separate section. The format of the emitted data is represented as follows. It includes a header for every function: | Address of the function | -> 8 bytes (pointer size) | Number of basic blocks in this function (>0) | -> ULEB128 The header is followed by a BB record for every basic block. These records are ordered in the same order as MachineBasicBlocks are placed in the function. Each BB Info is structured as follows: | Offset of the basic block relative to function begin | -> ULEB128 | Binary size of the basic block | -> ULEB128 | BB metadata | -> ULEB128 [ MBB.isReturn() OR MBB.hasTailCall() << 1 OR MBB.isEHPad() << 2 ] The new feature will replace the existing "BB labels" functionality with -basic-block-sections=labels. The .bb_addr_map section scrubs the specially-encoded BB symbols from the binary and makes it friendly to profilers and debuggers. Furthermore, the new feature reduces the binary size overhead from 70% bloat to only 12%. For more information and results please refer to the RFC: https://lists.llvm.org/pipermail/llvm-dev/2020-July/143512.html Reviewed By: MaskRay, snehasish Differential Revision: https://reviews.llvm.org/D85408	2020-09-14 10:16:44 -07:00
Sanjay Patel	55d371abd7	[InstSimplify] add folds for fmin/fmax with 'nnan' maximum(nnan X, +INF) --> +INF minimum(nnan X, -INF) --> -INF This is based on the similar codegen transform proposed in: D87571	2020-09-14 11:46:11 -04:00
Sanjay Patel	7526376164	[InstSimplify] allow folds for fmin/fmax with 'ninf' maxnum(ninf X, +FLT_MAX) --> +FLT_MAX minnum(ninf X, -FLT_MAX) --> -FLT_MAX This is based on the similar codegen transform proposed in: D87571	2020-09-14 11:18:08 -04:00
Florian Hahn	c4f1b31441	[MemorySSA] Make sure PerformedPhiTrans is updated for each visited def. `1ce82015f6` added a fix to restrict phi optimizations after phi translations. But the current use of performedPhiTranslation only checked whether phi translation happened for the first iterator and missed cases where phi translation happens at subsequent iterators/upwards defs. This patch changes upward_defs_iterator to take a pointer to a bool, so we can easily ensure the final value includes all visited defs, while still being able to conveniently use it with make_range & co.	2020-09-14 16:11:56 +01:00
Sanjay Patel	22c583c3d0	[InstSimplify] reduce code duplication for fmin/fmax folds; NFC We use the same code structure for folding integer min/max.	2020-09-14 10:32:11 -04:00
jasonliu	9868ea764f	[XCOFF][AIX] Handle TOC entries that could not be reached by positive range in small code model Summary: In small code model, AIX assembler could not deal with labels that could not be reached within the [-0x8000, 0x8000) range from TOC base. So when generating the assembly, we would need to help the assembler by subtracting an offset from the label to keep the actual value within [-0x8000, 0x8000). Reviewed By: hubert.reinterpretcast, Xiangling_L Differential Revision: https://reviews.llvm.org/D86879	2020-09-14 13:41:34 +00:00
Sanjay Patel	7bb9a2f996	[InstSimplify] fix miscompiles with maximum/minimum intrinsics As discussed in the sibling codegen functionality patch D87571, this transform was created with D52766, but it is not correct. The incorrect test diffs were missed during review, but the 'TODO' comment about this functionality was still in the code - we need 'nnan' to enable this fold.	2020-09-14 09:06:41 -04:00
Jay Foad	c799f873cb	[AMDGPU] Don't cluster stores Clustering loads has caching benefits, but as far as I know there is no advantage to clustering stores on any AMDGPU subtargets. The disadvantage is that it tends to increase register pressure and restricts scheduling freedom. Differential Revision: https://reviews.llvm.org/D85530	2020-09-14 13:40:17 +01:00
Simon Pilgrim	98eaacd73d	Assert we've found both vector types. NFCI. Fixes clang static analyzer warning about potential null dereferences.	2020-09-14 13:24:17 +01:00
Simon Pilgrim	7109fc9e42	Don't dereference from a dyn_cast<>. NFCI. Use cast<> instead which will assert if it fails and not just return null. Fixes clang static analyzer warning.	2020-09-14 13:05:17 +01:00
Max Kazantsev	412b417bfa	[NFC] Add missing `const` statements in SCEV	2020-09-14 18:43:24 +07:00
David Green	06fb4e9064	[CGP] Limit converting phi types to simple loads and stores Instcombine limits converting phi types to simple loads and stores. This does the same in codegenprepare, not processing phis that are not simple. Note that volatile loads/store ISel will happily convert between float and int. Atomics are more likely to always be integer. This just keeps things simple and doesn't process either. Differential Revision: https://reviews.llvm.org/D83770	2020-09-14 12:08:34 +01:00
Florian Hahn	f715d81c9d	[DSE] Only eliminate candidates that always store the same loc. AliasAnalysis/MemoryLocation does not account for loops. Two MemoryLocation can be must-overwrite, even if the first one writes multiple locations in a loop. This patch prevents removing such stores, by only considering candidates that are known to be loop invariant, or executed in the same BB. Currently the invariant check is quite conservative and only considers Alloca and Alloca-like instructions and arguments as invariant base pointers. It also considers GEPs with all constant indices and invariant bases as invariant. This can be improved in the future, but the current implementation has only minor impact on the total number of stores eliminated (25903 vs 26047 for the baseline). There are some 2-10% swings for some individual benchmarks. In roughly half of the cases, the number of stores removed increases actually, because we skip candidates that are unlikely to be valid candidates early.	2020-09-14 12:06:58 +01:00
Meera Nakrani	dd519bf0b0	[ARM] Selects SSAT/USAT from correct LLVM IR LLVM will canonicalize conditional selectors to a different pattern than the old code that was used. This is updating the function to match the new expected patterns and select SSAT or USAT when successful. Tests have also been updated to use the new patterns. Differential Review: https://reviews.llvm.org/D87379	2020-09-14 10:58:21 +00:00
Sjoerd Meijer	676febc044	[ARM][MVE] Tail-predication: check get.active.lane.mask's TC value This adds additional checks for the original scalar loop tripcount value, i.e. get.active.lane.mask second argument, and performs several sanity checks to see if it is of the form that we expect, similar to what we already do for the IV, which is the first argument of get.active.lane. Differential Revision: https://reviews.llvm.org/D86074	2020-09-14 11:32:15 +01:00
David Sherwood	816663adb5	[SVE] In LoopIdiomRecognize::isLegalStore bail out for scalable vectors The function LoopIdiomRecognize::isLegalStore looks for stores in loops that could be transformed into memset or memcpy. However, the algorithm currently requires that we know how big the store is at runtime, i.e. that the store size will not overflow an unsigned integer. For scalable vectors we cannot guarantee this so I have changed the code to bail out for now. In addition, even if we add a way to query the maximum value of vscale in future we will still need to update the algorithm to cope with non-constant strides. The additional cost associated with calculating the memset and memcpy arguments will need to be taken into account as well. This patch also fixes up an implicit TypeSize -> uint64_t cast, thereby removing a warning. I've added tests here showing a fixed width vector loop being transformed into memcpy, and a scalable vector loop remaining unchanged: Transforms/LoopIdiom/memcpy-vectors.ll Differential Revision: https://reviews.llvm.org/D87439	2020-09-14 11:28:31 +01:00
Petar Avramovic	6e2a86ed5a	AMDGPU/GlobalISel Check for NoNaNsFPMath in isKnownNeverSNaN Check for NoNaNsFPMath function attribute in isKnownNeverSNaN. Function attributes are held in 'TargetMachine.Options'. Among other things, this allows selection of some patterns imported in D87351 since G_FCANONICALIZE is not generated when isKnownNeverSNaN returns true in lowerFMinNumMaxNum. However we notice some incorrect results since function attributes are not correctly written in TargetMachine.Options when the next function is processed. Take a look at @v_test_no_global_nnans_med3_f32_pat0_srcmod0, it has "no-nans-fp-math"="false" but TargetMachine.Options still has it set to true since the first function in the test file had this attribute set to true. This will be fixed in D87511. Differential Revision: https://reviews.llvm.org/D87456	2020-09-14 12:11:00 +02:00
Simon Pilgrim	00e5676cf6	[LegalizeDAG] Fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warning. NFCI.	2020-09-14 11:09:43 +01:00
Jeremy Morse	d3af441dfe	[DebugInstrRef][1/9] Add fields for instr-ref variable locations Add a DBG_INSTR_REF instruction and a "debug instruction number" field to MachineInstr. The two allow variable values to be specified by identifying where the value is computed, rather than the register it lies in, like so: %0 = fooinst, debug-instr-number 1 [...] DBG_INSTR_REF 1, 0 See the original RFC for motivation: http://lists.llvm.org/pipermail/llvm-dev/2020-February/139440.html This patch is NFCI; it only adds fields and other boilerplate. Differential Revision: https://reviews.llvm.org/D85741	2020-09-14 10:06:52 +01:00
Petar Avramovic	09b8871f8d	AMDGPU/GlobalISel/Emitter Support for predicate code that uses operands Predicates with 'let PredicateCodeUsesOperands = 1' want to examine matched operands. When we encounter predicate code that uses operands, analyze its named operand arguments and create a map between argument index and name. Later, when leaf node with name is encountered, emit GIM_RecordNamedOperand that will store that operand at its argument index in operand list. This operand list will be an argument to c++ code of the predicate. Differential Revision: https://reviews.llvm.org/D87285	2020-09-14 10:39:56 +02:00
David Stenberg	bfcb824ba5	[JumpThreading] Fix an incorrect Modified status This fixes PR47297. When ProcessBlock() was able to constant fold the terminator's condition, but not do any more transformations, the function would return false, which would lead to the JumpThreading pass returning an incorrect modified status. This patch makes it so that ProcessBlock() returns true in such cases. This will trigger an unnecessary invocation of ProcessBlock() in such cases, but this should rarely occur. This was caught using the check introduced by D80916. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D87392	2020-09-14 10:36:13 +02:00
Jay Foad	9a4476072e	[UnifyLoopExits] Fix non-deterministic iteration order This was causing random minor codegen differences in shaders compiled with the AMDGPU backend. Differential Revision: https://reviews.llvm.org/D87548	2020-09-14 09:09:58 +01:00
Simon Wallis	4946802c5f	[ARM] Fix so immediates and pc relative checks Treating an SoImm offset as a multiple of 4 between -1020 and 1020 mis-handles the second of a pair of 16-bit constants where the offset is a multiple of 2 but not a multiple of 4, leading to an LLVM ERROR: out of range pc-relative fixup value For 32-bit and larger (64-bit) constants, continue to treat an SoImm offset as a multiple of 4 between -1020 and 1020. For smaller (16-bit) constants, treat an SoImm offset as a multiple of 1 between -255 and 255. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D86949	2020-09-14 08:52:59 +01:00
David Sherwood	15bff4dec4	[CodeGen] Fix bug in IncrementPointer In an earlier patch I meant to add the correct flags to the ADD node when incrementing the pointer, but forgot to pass them to SelectionDAG::getNode. Differential Revision: https://reviews.llvm.org/D87496	2020-09-14 08:03:55 +01:00
Fangrui Song	4d7b194543	[llvm-cov gcov] Refactor counting and reporting The current organization of FileInfo and its referenced utility functions of (GCOVFile, GCOVFunction, GCOVBlock) is messy. Some members of FileInfo are just copied from GCOVFile. FileInfo::print (.gcov output and --intermediate output) is interleaved with branch statistics and computation of line execution counts. --intermediate has to do redundant .gcov output to gather branch statistics. This patch deletes lots of code and introduces a clearer work flow: ``` fn collectFunction for each block b for each line lineNum let line be LineInfo of the file on lineNum line.exists = 1 increment function's lines & linesExec if necessary increment line.count line.blocks.push_back(&b) fn collectSourceLine compute cycle counts count = incoming_counts + cycle_counts if line.exists ++summary->lines if line.count ++summary->linesExec fn collectSource for each line call collectSourceLine fn main for each function call collectFunction print function summary for each source file call collectSource print file summary annotate the source file with line execution counts if -i print intermediate file ``` The output order of functions and files now follows the original order in .gcno files.	2020-09-13 23:00:59 -07:00
Yevgeny Rouban	88690a9658	[CodeGenPrepare] Fix zapping dead operands of assume This patch fixes a problem of the commit `52cc97a0`. A test case is created to demonstrate the crash caused by the instruction iterator invalidated by the recursive removal of dead operands of assume. The solution restarts from the blocks's first instruction in case CurInstIterator is invalidated by RecursivelyDeleteTriviallyDeadInstructions(). Reviewed By: bkramer Differential Revision: https://reviews.llvm.org/D87434	2020-09-14 11:46:34 +07:00
Craig Topper	56b33391d3	[SelectionDAG] Move ISD::PARITY formation from DAGCombine to SimplifyDemandedBits. Previously, we formed ISD::PARITY by looking for (and (ctpop X), 1) but the AND might be separated from the ctpop. For example if the parity result is multiplied by 2, we'll pull the AND through the shift. So to handle more cases, move to SimplifyDemandedBits where we can handle more cases that result in only the LSB of the CTPOP being used.	2020-09-13 21:04:13 -07:00
Lang Hames	783ba64a89	[JITLink] Improve formatting for Edge, Block and Symbol debugging output.	2020-09-13 15:44:07 -07:00
Fangrui Song	b2c32c90ba	[llvm-cov gcov] Add -r (--relative-only) && -s (--source-prefix) gcov 4.7 introduced the two options. https://sourceware.org/pipermail/gcc-patches/2011-November/328782.html -r only dumps files with relative paths or absolute paths with the prefix specified by -s. The two options are useful for filtering out system header files.	2020-09-13 14:54:20 -07:00
David Blaikie	ce89eeee16	PPCInstrInfo: Fix readability-inconsistent-declaration-parameter-name clang-tidy warning Reduces the chance of confusion when calling the function with autocomplete (will show the more accurate/informative variable name), etc.	2020-09-13 13:08:17 -07:00
David Blaikie	6e06f1cd08	GCOVProfiling: Avoid use-after-move Turns out this was use-after-move of function_ref, which is trivially copyable and movable, so the move did nothing and use after move was safe. But since this function_ref is being copied into a std::function, change the function_ref to be std::function to avoid extra layers of type erasure indirection - and then it's a real use after move, and fix that by referring to the moved-to member variable rather than the moved-from parameter.	2020-09-13 12:54:36 -07:00
Qiu Chaofan	a4c5351986	[DAGCombiner] Propagate FMF flags in FMA folding DAG combiner folds (fma a 1.0 b) into (fadd a b) but the flag isn't propagated into new fadd. This patch fixes that. Some code in visitFMA is redundant and such support for vector constants is missing. Need follow-up patch to clean. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D87037	2020-09-14 00:19:06 +08:00
David Green	9237fde481	[CGP] Prevent optimizePhiType from iterating forever The recently added optimizePhiType algorithm had no checks to make sure it didn't continually iterate back and forth between float and int types. This means that given an input like store(phi(bitcast(load))), we could convert that back and forth to store(bitcast(phi(load))). This particular case would usually have been simplified to a different load type (folding the bitcast into the load) before CGP, but other cases can occur. The one that came up was phi(bitcast(phi)), where the two phis of different types were bitcast between. That was not helped by a dead bitcast being kept around which could make conversion look profitable. This adds an extra check of the bitcast Uses or Defs, to make sure that at least one is grounded and will not end up being converted back. It also makes sure that dead bitcasts are removed, and there is a minor change to include newly created Phi nodes in the Visited set so that they do not need to be revisited. Differential Revision: https://reviews.llvm.org/D82676	2020-09-13 16:11:01 +01:00
Qiu Chaofan	bec81dc67d	Reland "[PowerPC] Implement instruction clustering for stores" Commit `3c0b3250` introduced store fusion for PowerPC target, but it brought failure under UB sanitizer and was reverted. This patch fixes them.	2020-09-13 19:51:01 +08:00
Fangrui Song	5f4e9bf641	[gcov] Fix memory leak due to BranchProbabilityInfoWrapperPass This is weird.	2020-09-13 00:44:32 -07:00
Fangrui Song	63182c2ac0	[gcov] Add spanning tree optimization gcov is an "Edge Profiling with Edge Counters" application according to Optimally Profiling and Tracing Programs (1994). The minimum number of counters necessary is |E|-(|V|-1). The unmeasured edges form a spanning tree. Both GCC --coverage and clang -fprofile-generate leverage this optimization. This patch implements the optimization for clang --coverage. The produced .gcda files are much smaller now.	2020-09-13 00:07:31 -07:00
Fangrui Song	f086e85eea	[gcov] Assign names to some types and loaded values used in @__llvm_internal* This makes the generated IR much more readable.	2020-09-12 22:42:37 -07:00
Fangrui Song	8cf1ac97ce	[llvm-cov gcov] Improve accuracy when some edges are not measured Also guard against infinite recursion if GCOV_ARC_ON_TREE edges contain a cycle.	2020-09-12 22:33:41 -07:00
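
Note on 7841e21c98: the .bb_addr_map layout described in that commit message (an 8-byte function address, a ULEB128 block count, then three ULEB128 fields per basic block) can be illustrated with a short Python sketch. This only mirrors the byte layout as the message states it and is not LLVM's emitter. The function names (`uleb128`, `bb_metadata`, `encode_function`), the little-endian byte order for the address, and the sample values are assumptions made for the example.

```python
import struct


def uleb128(value: int) -> bytes:
    """Encode a non-negative integer as ULEB128 (7 payload bits per byte, low bits first)."""
    if value < 0:
        raise ValueError("ULEB128 encodes non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def bb_metadata(is_return: bool, has_tail_call: bool, is_eh_pad: bool) -> int:
    """Pack the three flags as isReturn | hasTailCall << 1 | isEHPad << 2."""
    return int(is_return) | (int(has_tail_call) << 1) | (int(is_eh_pad) << 2)


def encode_function(address: int, blocks: list) -> bytes:
    """Encode one function entry.

    blocks is a list of (offset_from_function_begin, size, metadata) tuples,
    in the order the blocks are placed in the function.
    """
    if not blocks:
        raise ValueError("the commit message requires at least one basic block")
    # Assumption: 8-byte little-endian address (the message only says "8 bytes").
    out = bytearray(struct.pack("<Q", address))
    out += uleb128(len(blocks))
    for offset, size, metadata in blocks:
        out += uleb128(offset) + uleb128(size) + uleb128(metadata)
    return bytes(out)


# Hypothetical function at 0x1000 with two blocks; the second one returns.
entry = encode_function(
    0x1000,
    [
        (0, 16, bb_metadata(False, False, False)),
        (16, 200, bb_metadata(True, False, False)),
    ],
)
print(entry.hex())
```

Because every per-block field is ULEB128, small offsets and sizes take a single byte each, which is consistent with the reduced size overhead the commit message reports.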

138936 Commits