llvm-project

Commit Graph

Author	SHA1	Message	Date
Slava Zakharin	b163ac33bd	[mlir][math] Lower atan to libm Differential Revision: https://reviews.llvm.org/D128454	2022-06-23 10:49:25 -07:00
Sam McCall	7aff663b2a	[pseudo] Store reduction sequences by pointer in heaps, instead of by value. Copying sequences around as the heap resized is significantly expensive. This speeds up glrParse by ~35% (2.4 => 3.25 MB/s) Differential Revision: https://reviews.llvm.org/D128307	2022-06-23 19:41:11 +02:00
Matthias Springer	3474d10e1a	[mlir][bufferization][NFC] Make `escape` a dialect attribute All bufferizable ops that bufferize to an allocation receive a `bufferization.escape` attribute during TensorCopyInsertion. Differential Revision: https://reviews.llvm.org/D128137	2022-06-23 19:34:47 +02:00
Peter Klausler	b6fce8b92d	[flang] Fix bogus errors from SIZE/SHAPE/UBOUND on assumed-shape While it is indeed an error to use SIZE, SHAPE, or UBOUND on an assumed-shape dummy argument without also supplying a DIM= argument to the intrinsic function, it is not an error to use these intrinsic functions on sections or expressions of such arrays. Refine the test used for the error message. Differential Revision: https://reviews.llvm.org/D128391	2022-06-23 10:32:22 -07:00
Sam McCall	3e610f2cdc	[pseudo] Turn glrReduce into a class, reuse storage across calls. This is a ~5% speedup, we no longer have to allocate the priority queues and other collections for each reduction step where we use them. It's also IMO easier to understand the structure of a class with methods vs a function with nested lambdas. Differential Revision: https://reviews.llvm.org/D128301	2022-06-23 19:27:47 +02:00
Philip Reames	1cc9792281	[RISCV] Fix a crash in InsertVSETVLI where we hadn't properly guarded for a SEWLMULRatioOnly abstract state A forward abstract state can be in the special SEWLMULRatioOnly state which means we're not allowed to inspect its fields. The scalar to vector move case was missing a guard, and we'd crash on an assert. Test cases included.	2022-06-23 10:25:16 -07:00
Florian Hahn	d9526e8a52	[ConstraintElimination] Use stable_sort to sort worklist. If there are multiple constraints in the same block, at the moment the order they are processed may be different depending on the sort implementation. Use stable_sort to ensure consistent ordering.	2022-06-23 19:22:15 +02:00
Joseph Huber	6e6889288c	[Offloading] Embed the target features in the OffloadBinary The target features are necessary for correctly compiling most programs in LTO mode. Currently, these are derived in clang at link time and passed as an argument to the linker wrapper. This is problematic because it requires knowing the required toolchain at link time, which should not be necessary. Instead, these features should be embedded into the offloading binary so we can unify them in the linker wrapper for LTO. This also required changing the offload packager to interpret multiple arguments as concatenation with a comma. This is so we can still use the `,` separator for the argument list. Depends on D127246 Reviewed By: tra Differential Revision: https://reviews.llvm.org/D127686	2022-06-23 13:15:01 -04:00
Arthur Eubanks	865812c3af	[docs][NewPM] Add more info on why accessing mutable outer analyses is disallowed Reviewed By: asbirlea, rnk Differential Revision: https://reviews.llvm.org/D128374	2022-06-23 10:05:37 -07:00
Peter Klausler	ede4213169	[flang][runtime] Handle READ of non-UTF-8 data into multi-byte CHARACTER When a READ statement reads into a CHARACTER(2 or 4) variable from a unit whose encoding is not UTF-8, don't copy bytes directly; they must each be zero-extended. Differential Revision: https://reviews.llvm.org/D128390	2022-06-23 10:02:14 -07:00
Arthur Eubanks	b257acd266	[test][GlobalOpt] Update precommitted test	2022-06-23 09:56:31 -07:00
Peter Klausler	1650fb8a53	[flang][runtime] Respect PAD='NO' on READ/WRITE The check for the PAD= setting should examine the mutable modes of the current I/O statement, not the persistent modes of the I/O unit. Differential Revision: https://reviews.llvm.org/D128389	2022-06-23 09:50:22 -07:00
gpetters94	bc07634b5a	Adding a named op for grouped convolutions	2022-06-23 16:32:22 +00:00
Sam McCall	f9710d1908	[pseudo] Add a fast-path to GLR reduce when both pop and push are trivial In general we split a reduce into pop/push, so concurrently-available reductions can run in the correct order. The data structures for this are expensive. When only one reduction is possible at a time, we need not do this: we can pop and immediately push instead. Strictly this is correct whenever we yield one concurrent PushSpec. This patch recognizes a trivial but common subset of these cases: - there must be no pending pushes and only one head available to pop - the head must have only one reduction rule - the reduction path must be a straight line (no multiple parents) On my machine this speeds up by 2.12 -> 2.30 MB/s = 8% Differential Revision: https://reviews.llvm.org/D128299	2022-06-23 18:21:59 +02:00
Sam McCall	b70ee9d984	Reland "[pseudo] Track heads as GSS nodes, rather than as "pending actions"." This reverts commit `2c80b53198`. Fixes LRTable::buildForTest to create states that are referenced but have no actions.	2022-06-23 18:21:44 +02:00
Peter Klausler	d771245a9d	[flang] Fix READ/WRITE with POS= on stream units, with refactoring First, ExternalFileUnit::SetPosition was being used both as a utility within the class' member functions as well as an API from I/O statement processing. Make it private, and add APIs for SetStreamPos and SetDirectRec. Second, ensure that SetStreamPos for POS= positioning in a stream doesn't leave the current record number and endfile record number in an arbitrary state. In stream I/O they are used only to manage end-of-file detection, and shouldn't produce false positive results from IsAtEnd() after repositioning. Differential Revision: https://reviews.llvm.org/D128388	2022-06-23 09:16:49 -07:00
Sam McCall	2c80b53198	Revert "[pseudo] Track heads as GSS nodes, rather than as "pending actions"." This reverts commit `e3ec054dfd`. Tests fail in asserts mode: https://lab.llvm.org/buildbot/#/builders/109/builds/41217	2022-06-23 18:16:38 +02:00
Philip Reames	0c1326748f	[BasicTTI] Avoid crash when costing scalable select expansion If the target has chosen to expand a scalable vector type, BasicTTI tries to scalarize and we'd crash. As a minimum, we should return an invalid cost instead. The added tests provide coverage for the moment, but given they show a number of gaps in RISCV costing, they're likely not to cover this code path long term.	2022-06-23 09:14:57 -07:00
Jonas Devlieghere	70841b97eb	[lldb] Make thread safety the responsibility of the log handlers Drop the thread-safe flag and make the locking strategy the responsibility of the individual log handler. Previously we got away with a non-thread safe mode because we were using unbuffered streams that rely on the underlying syscalls/OS for synchronization. With the introduction of log handlers, we can have arbitrary logic involved in writing out the logs. With this patch the log handlers can pick the most appropriate locking strategy for their particular implementation. Differential revision: https://reviews.llvm.org/D127922	2022-06-23 09:12:05 -07:00
Jonas Devlieghere	09dea54669	[lldb] Support a buffered logging mode This patch adds a buffered logging mode to lldb. A buffer size can be passed to `log enable` with the -b flag. If no buffer size is specified, logging is unbuffered. Differential revision: https://reviews.llvm.org/D127986	2022-06-23 09:12:01 -07:00
Valentin Clement	734ad031f1	[flang] Handle boxed characters that are values when doing a conversion Character conversion requires memory storage as it operates on a sequence of code points. This patch is part of the upstreaming effort from fir-dev branch. Reviewed By: PeteSteinfeld Differential Revision: https://reviews.llvm.org/D128438 Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>	2022-06-23 18:05:24 +02:00
Val Donaldson	124338dd80	[flang] Increase support for intrinsic module procedures * Make Semantics test doconcurrent01.f90 an expected failure pending a fix for a problem in recognizing a PURE prefix specifier for a specific procedure that occurs in new intrinsic module source code, * review update * review update * Increase support for intrinsic module procedures The f18 standard defines 5 intrinsic modules that define varying numbers of procedures, including several operators: 2 iso_fortran_env 55 ieee_arithmetic 10 ieee_exceptions 0 ieee_features 6 iso_c_binding There are existing fortran source files for each of these intrinsic modules. This PR adds generic procedure declarations to these files for procedures that do not already have them, together with associated specific procedure declarations. It also adds the capability of recognizing intrinsic module procedures in lowering code, making it possible to use existing language intrinsic code generation for intrinsic module procedures for both scalar and elemental calls. Code can then be generated for intrinsic module procedures using existing options, including front end folding, direct inlining, and calls to runtime support routines. Detailed code generation is provided for several procedures in this PR, with others left to future PRs. Procedure calls that reach lowering and don't have detailed implementation support will generate a "not yet implemented" message with a recognizable name. The generic procedures in these modules may each have as many as 36 specific procedures. Most specific procedures are generated via macros that generate type specific interface declarations. These specific declarations provide detailed argument information for each individual procedure call, similar to what is done via other means for standard language intrinsics. The modules only provide interface declarations. There are no procedure definitions, again in keeping with how language intrinsics are processed. This patch is part of the upstreaming effort from fir-dev branch. Reviewed By: jeanPerier, PeteSteinfeld Differential Revision: https://reviews.llvm.org/D128431 Co-authored-by: V Donaldson <vdonaldson@nvidia.com>	2022-06-23 18:03:48 +02:00
LLVM GN Syncbot	57b0d940d5	[gn build] Port `4045b62d4c`	2022-06-23 15:49:40 +00:00
Craig Topper	8b10ffabae	[RISCV] Disable <vscale x 1 x *> types with Zve32x or Zve32f. According to the vector spec, mf8 is not supported for i8 if ELEN is 32. Similarly mf4 is not supported for i16/f16 or mf2 for i32/f32. Since RVVBitsPerBlock is 64 and LMUL is calculated as ((MinNumElements * ElementSize) / RVVBitsPerBlock) this means we need to disable any type with MinNumElements==1. For generic IR, these types will now be widened in type legalization. For RVV intrinsics, we'll probably hit a fatal error somewhere. I plan to work on disabling the intrinsics in the riscv_vector.h header. Reviewed By: arcbbb Differential Revision: https://reviews.llvm.org/D128286	2022-06-23 08:49:18 -07:00
Nico Weber	0ec87addb7	[lld/mac] Add a few TimeTraceScopes Identical literal folding takes ~1.4% of the time, and was missing from the trace. Signature computation still needs ~2.2% of the time, so probably worth explicitly marking its contribution to "Write output file" (9.1%) Differential Revision: https://reviews.llvm.org/D128343	2022-06-23 11:46:57 -04:00
Craig Topper	4045b62d4c	[RISCV] Add macrofusion infrastructure and one example usage. This adds the macrofusion plumbing and support fusing LUI+ADDI(W). This is similar to D73643, but handles a different case. Other cases can be added in the future. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D128393	2022-06-23 08:38:39 -07:00
Joe Nash	ae72fee74e	[AMDGPU] gfx11 Select on Buffer Atomic FAdd Rtn type Reviewed By: #amdgpu, foad, rampitec Differential Revision: https://reviews.llvm.org/D128205	2022-06-23 11:05:32 -04:00
Florian Hahn	94ed2caf70	Revert "[ConstraintElimination] Transfer info from ULT to signed system." This reverts commit `316e106f49`. This breaks a bot with expensive checks.	2022-06-23 17:27:33 +02:00
Sam McCall	e3ec054dfd	[pseudo] Track heads as GSS nodes, rather than as "pending actions". IMO this model is simpler to understand (borrowed from the LR0 patch D127357). It also makes error recovery easier to implement, as we have a simple list of head nodes lying around to recover from when needed. (It's not quite as nice as LR0 in this respect though). It's slightly slower (2.24 -> 2.12 MB/s on my machine = 5%) but nothing close to as bad as LR0. However - I think we'd have to eat a little performance loss otherwise to implement error recovery. - this frees up some complexity budget for optimizations like fastpath push/pop (this + fastpath is already faster than head) - I haven't changed the data structure here and it's now pretty dumb, we can make it faster Differential Revision: https://reviews.llvm.org/D128297	2022-06-23 17:26:42 +02:00
Mark de Wever	9afaa158f5	[libc++][format] Copy code to new location. This is a helper patch to ease the reviewing of D128139. The originals will be removed at a later time when all formatters are converted to the new style. (Floating-point and pointer aren't up for review yet.) Reviewed By: #libc, ldionne Differential Revision: https://reviews.llvm.org/D128367	2022-06-23 17:21:37 +02:00
Florian Hahn	316e106f49	[ConstraintElimination] Transfer info from ULT to signed system. If A u< B holds, then A s>= 0 && A s< B holds if B s>= 0. https://alive2.llvm.org/ce/z/RrNxHh	2022-06-23 17:17:01 +02:00
Jan Svoboda	9ec7e4df57	[clang][driver] NFC, test: Make test output order-independent	2022-06-23 17:15:28 +02:00
Daniel Bertalan	ed39fd515a	[lld-macho] Use source information in duplicate symbol errors Similarly to how undefined symbol diagnostics were changed in D128184, we now show where in the source file duplicate symbols are defined at: ld64.lld: error: duplicate symbol: _foo >> defined in bar.c:42 >> /path/to/bar.o >> defined in baz.c:1 >> /path/to/libbaz.a(baz.o) For objects that don't contain DWARF data, the format is unchanged. A slight difference to undefined symbol diagnostics is that we don't print the name of the symbol on the third line, as it's already contained on the first line. Differential Revision: https://reviews.llvm.org/D128425	2022-06-23 11:07:15 -04:00
Bradley Smith	6f27df5084	[AArch64][SVE] Match (add x (lsr/asr y c)) -> usra/ssra x y c Differential Revision: https://reviews.llvm.org/D128045	2022-06-23 14:56:21 +00:00
Baptiste Saleil	79e77a9f39	[AMDGPU] Flush the vmcnt counter in loop preheaders when necessary waitcnt vmcnt instructions are currently generated in loop bodies before using values loaded outside of the loop. In some cases, it is better to flush the vmcnt counter in a loop preheader before entering the loop body. This patch detects these cases and generates waitcnt instructions to flush the counter. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D115747	2022-06-23 10:53:21 -04:00
Nico Weber	851a5efe45	Revert "[fastalloc] Support allocating specific register class in fastalloc" This reverts commit `719658d078`. Breaks a few things, see comments on https://reviews.llvm.org/D128437 There's disagreement about the best fix. So let's keep HEAD green while discussions are happening.	2022-06-23 10:44:24 -04:00
Joseph Huber	7c9a3825b8	[Binary] Fix leftover line	2022-06-23 10:36:25 -04:00
Joseph Huber	4e2a0092b9	[Binary] Reserve the correct size for the OffloadBinary Summary: When writing the offload binary, we use a SmallVector. We already know the size that we expect the buffer to take up so we should reserve all that memory up-front to improve performance. Also this patch adds some extra sanity checks for the binary format for safety.	2022-06-23 10:35:29 -04:00
Nikita Popov	8b6f69a4da	[BasicAA] Add test for call incorrectly treated as escape source (NFC)	2022-06-23 16:30:30 +02:00
David Green	bd1a4c8565	[ValueTracking] Teach isKnownNonZero that a vscale is never 0. A llvm.vscale will always be at least 1, never zero. Teaching that to isKnownNonZero can help fold away some statically known compares. Differential Revision: https://reviews.llvm.org/D128217	2022-06-23 15:25:24 +01:00
Ilya Biryukov	342e64979a	[Sema] Fix assertion failure when instantiating requires expression Fixes #54629. The crash is caused by the double template instantiation. See the added test. Here is what happens: - Template arguments for the partial specialization get instantiated. - This causes instantiation into the corresponding requires expression. - `TemplateInsantiator` correctly handles instantiation of parameters inside `RequiresExprBody` and instantiates the constraint expression inside the `NestedRequirement`. - To build the substituted `NestedRequirement`, `TemplateInsantiator` calls `Sema::BuildNestedRequirement`, which calls `CheckConstraintSatisfaction`, which results in another template instantiation (with empty template arguments). This seems to be an implementation detail to handle constraint satisfaction and is not required by the standard. - The recursive template instantiation tries to find the parameter inside `RequiresExprBody` and fails with the corresponding assertion. Note that this only happens as both instantiations happen with the class partial template specialization set as `Sema.CurContext`, which is considered a dependent `DeclContext`. To fix the assertion, avoid doing the recursive template instantiation and instead evaluate resulting expressions in-place. Reviewed By: erichkeane Differential Revision: https://reviews.llvm.org/D127487	2022-06-23 16:20:30 +02:00
Florian Hahn	9d2349c78f	[LSR] Move transform test from test/Analysis to test/Transforms. Also auto-generate check lines.	2022-06-23 16:04:45 +02:00
Jay Foad	2b4931ef8a	[AMDGPU] Use -check-prefixes in a test. NFC.	2022-06-23 14:59:44 +01:00
Florian Hahn	9a33f3975e	[ConstraintElimination] Transfer info from SLT to unsigned system. If A s< B holds, then A u< B also holds, if A s>= 0. https://alive2.llvm.org/ce/z/J4JZuN	2022-06-23 15:57:59 +02:00
chenglin.bi	30e49a3794	[InstCombine] Optimise shift+and+boolean conversion pattern to simple comparison if (`C1` is pow2) & (`(C2 & ~(C1-1)) + C1)` is pow2): ((C1 << X) & C2) == 0 -> X >= (Log2(C2+C1) - Log2(C1)); https://alive2.llvm.org/ce/z/EJAl1R ((C1 << X) & C2) != 0 -> X < (Log2(C2+C1) - Log2(C1)); https://alive2.llvm.org/ce/z/3bVRVz And remove dead code. Fix: https://github.com/llvm/llvm-project/issues/56124 Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D126591	2022-06-23 21:53:07 +08:00
Sam McCall	6b187fdf3b	[pseudo] Add xfail tests for a simple-declaration/function-definition ambiguity I expect to eliminate this ambiguity at the grammar level by use of guards, because it interferes with brace-based error recovery. Differential Revision: https://reviews.llvm.org/D127400	2022-06-23 15:52:22 +02:00
Rodrigo Dominguez	971fa4b196	[AMDGPU] GFX11: remove ShaderType from ds_ordered_count offset field In GFX11 ShaderType is determined by the hardware and should no longer be written into bits[3:2] of the ds_ordered_count offset field. Differential Revision: https://reviews.llvm.org/D128196	2022-06-23 14:20:33 +01:00
Jay Foad	74c3f9c191	[AMDGPU] Precommit test for D128196	2022-06-23 14:15:45 +01:00
Ruiling Song	49b8ca3f7c	AMDGPU: Don't crash on global_ctor/dtor declaration Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D128320	2022-06-23 21:04:54 +08:00
Valentin Clement	ab89c132b5	[flang] Add lowering TODO for separate module procedures MODULE FUNCTION and MODULE SUBROUTINE currently cause lowering crash: "symbol is not mapped to any IR value" because special care is needed to handle their interface. Add a TODO for now. Example of program that crashed and will hit the TODO: ``` module mod interface module subroutine sub end subroutine end interface contains module subroutine sub x = 42 end subroutine end module ``` This patch is part of the upstreaming effort from fir-dev branch. Reviewed By: jeanPerier Differential Revision: https://reviews.llvm.org/D128412 Co-authored-by: Jean Perier <jperier@nvidia.com>	2022-06-23 14:57:58 +02:00
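
Commit d9526e8a52 ([ConstraintElimination] Use stable_sort to sort worklist) relies on the guarantee that a stable sort keeps entries with equal keys in their original relative order. The following is a minimal sketch of that guarantee only, not the pass's actual code: the `Entry` struct, its fields, and `sortedNames` are hypothetical names chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a worklist entry; not the actual LLVM type.
struct Entry {
  unsigned BlockOrder; // sort key: position of the owning block
  std::string Name;    // label used to observe the resulting order
};

// Sort by block position only. std::stable_sort guarantees that entries with
// equal keys keep their original relative order, so the result does not
// depend on how a particular standard library implements sorting.
std::vector<std::string> sortedNames(std::vector<Entry> Worklist) {
  std::stable_sort(Worklist.begin(), Worklist.end(),
                   [](const Entry &A, const Entry &B) {
                     return A.BlockOrder < B.BlockOrder;
                   });
  std::vector<std::string> Names;
  for (const Entry &E : Worklist)
    Names.push_back(E.Name);
  return Names;
}
```

With `std::sort`, the relative order of two entries sharing a `BlockOrder` would be unspecified, which is the source of implementation-dependent processing order that the commit message describes.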
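
Commits 9a33f3975e and 316e106f49 (the latter reverted by 94ed2caf70) state two implications between signed and unsigned comparisons. The sketch below checks only the arithmetic claims, exhaustively over 8-bit values; it says nothing about the pass itself or the reason for the revert. The function names are hypothetical, and 8 bits is an assumption made to keep the search small.

```cpp
#include <cassert>
#include <cstdint>

// Claim from 9a33f3975e: if A s< B and A s>= 0, then A u< B.
bool sltImpliesUlt() {
  for (int A = -128; A <= 127; ++A)
    for (int B = -128; B <= 127; ++B) {
      bool Premise = A < B && A >= 0;
      // Reinterpret the same 8-bit patterns as unsigned values.
      bool UnsignedLess = static_cast<uint8_t>(A) < static_cast<uint8_t>(B);
      if (Premise && !UnsignedLess)
        return false;
    }
  return true;
}

// Claim from 316e106f49: if A u< B and B s>= 0, then A s>= 0 and A s< B.
bool ultImpliesSlt() {
  for (int A = -128; A <= 127; ++A)
    for (int B = -128; B <= 127; ++B) {
      bool Premise =
          static_cast<uint8_t>(A) < static_cast<uint8_t>(B) && B >= 0;
      bool Conclusion = A >= 0 && A < B;
      if (Premise && !Conclusion)
        return false;
    }
  return true;
}
```

Both functions return `true` when no counterexample exists among all 65536 pairs; the alive2 links in the commit messages are the authoritative proofs for arbitrary bit widths.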

427746 Commits

All Branches