llvm-project

Commit Graph

Author	SHA1	Message	Date
Kazu Hirata	e0039b8d6a	Use llvm::less_second (NFC)	2022-06-04 22:48:32 -07:00
Kazu Hirata	9a8e65de8c	[Target] Use MachineBasicBlock::erase (NFC)	2022-06-04 22:41:24 -07:00
Kazu Hirata	bcf4fa458a	[CodeGen] Use a range-based for loop (NFC)	2022-06-04 22:26:55 -07:00
Kazu Hirata	8cc9fa6f78	Use static_cast from SmallString to std::string (NFC)	2022-06-04 22:09:27 -07:00
Kazu Hirata	4969a6924d	Use llvm::less_first (NFC)	2022-06-04 21:23:18 -07:00
Kazu Hirata	32ce076d78	[CodeGen] Use StringRef::contains (NFC)	2022-06-04 20:58:58 -07:00
Kazu Hirata	f83a88a179	[Transforms] Use llvm::is_contained (NFC)	2022-06-04 20:48:26 -07:00
LemonBoy	700eadca5f	[SPARC] Fix type for i64 inline asm operands Differential Revision: https://reviews.llvm.org/D101694	2022-06-04 18:32:16 -04:00
Florian Hahn	416a5080d8	[VPlan] Update vector latch terminator edge to exit block after execution. Instead of setting the successor to the exit using CFG.ExitBB, set it to nullptr initially. The successor to the exit block is later set either through createEmptyBasicBlock or after VPlan execution (because at the moment, no block is created by VPlan for the exit block, the existing one is reused). This also enables BranchOnCond to be used as terminator for the exiting block of the topmost vector region. Depends on D126618. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D126679	2022-06-04 21:22:32 +01:00
Jacques Pienaar	29794ab0fa	[mlir] Use context provided rather than getContext Avoids "pass state was never initialized" assertion failure.	2022-06-04 12:18:51 -07:00
Peter Klausler	03c066ab13	[flang][runtime] Catch OPEN of connected file Diagnose OPEN(FILE=f) when f is already connected by the same name to a distinct external I/O unit. Differential Revision: https://reviews.llvm.org/D127035	2022-06-04 11:06:37 -07:00
Peter Klausler	562fd2c99b	[flang][runtime] Emit error message rather than crashing for MOD(ULO)(x,P=0) Add extra arguments and checks to the runtime support library so that a call to the intrinsic functions MOD and MODULO with "denominator" argument P of zero will cause a crash with a source location rather than an uninformative floating-point error or integer division by zero signal. Additional work is required in lowering to (1) pass source file path and source line number arguments and (2) actually call these runtime library APIs instead of emitting inline code for MOD &/or MODULO. Differential Revision: https://reviews.llvm.org/D127034	2022-06-04 11:02:48 -07:00
Peter Klausler	11f928af9b	[flang][runtime] Fix deadlock in error recovery When an external I/O statement is in a recoverable error state before any data transfers take place (for example, an unformatted transfer with ERR=/IOSTAT=/IOMSG= attempted on a formatted unit), ensure that the unit's mutex is still released at the end of the statement. Differential Revision: https://reviews.llvm.org/D127032	2022-06-04 09:55:53 -07:00
Peter Klausler	ed71a0b45b	[flang] When folding FINDLOC, convert operands to a common type For example, FINDLOC(A,X) should convert both A and X to COMPLEX(8) if the operands are REAL(8) and COMPLEX(4), so that comparisons can be done without losing information. The current implementation unconditionally converts X to the type of the array A. Differential Revision: https://reviews.llvm.org/D127030	2022-06-04 09:26:13 -07:00
Peter Klausler	9a163ffe1a	[flang][runtime] Fix WRITE after OPEN(.., ACCESS="APPEND") The initial size of the file was not being captured as the file position on which the first output buffer should be framed. Differential Revision: https://reviews.llvm.org/D127029	2022-06-04 09:18:25 -07:00
Peter Klausler	dfcccc6dee	[flang][runtime] Fix edge case discrepancies with EN output editing The "engineering" ENw.d output editing descriptor has some difficult edge case behavior for values that might format into a bunch of 9's or round up to a 1 for a given scale factor. Fix the algorithm, and add tests to protect against regressions. Differential Revision: https://reviews.llvm.org/D127028	2022-06-04 09:14:05 -07:00
Peter Klausler	d484fe93d4	[flang] Don't crash on initialization with a zero-sized derived type Avoid calls to memcpy with zero byte counts if their address argument calculations may not be valid expressions. Differential Revision: https://reviews.llvm.org/D127027	2022-06-04 08:58:16 -07:00
Peter Klausler	ea5b205bb8	[flang][runtime] Don't crash after surviving internal output overflow After the program has survived its attempt to overflow the output buffer with an internal WRITE using ERR=, IOSTAT=, &/or IOMSG=, don't crash by accidentally blank-filling the next record that usually doesn't exist. Differential Revision: https://reviews.llvm.org/D127024	2022-06-04 08:47:13 -07:00
Peter Klausler	ea1a69d66d	[flang][runtime] Don't let random seed queries change the sequence When the current seed of the pseudo-random generator is queried with CALL RANDOM_SEED(GET=n), that query should not change the stream of pseudo-random numbers produced by CALL RANDOM_NUMBER(). Differential Revision: https://reviews.llvm.org/D127023	2022-06-04 08:01:46 -07:00
Mehdi Amini	369ce54bb3	Revert "[MLIR][GPU] Replace fdiv on fp16 with promoted (fp32) multiplication with reciprocal plus one (conditional) Newton iteration." This reverts commit `bcfc0a9051`. The build is broken with shared library enabled.	2022-06-04 08:35:45 +00:00
Fangrui Song	36c7d79dc4	Remove unneeded cl::ZeroOrMore for cl::opt options Similar to `557efc9a8b`. This commit handles options where cl::ZeroOrMore is more than one line below cl::opt.	2022-06-04 00:10:42 -07:00
Christian Sigg	bcfc0a9051	[MLIR][GPU] Replace fdiv on fp16 with promoted (fp32) multiplication with reciprocal plus one (conditional) Newton iteration. This is correct for all values, i.e. the same as promoting the division to fp32 in the NVPTX backend. But it is faster (~10% on average, sometimes more) because: - it performs fewer Newton iterations - it avoids the slow path for e.g. denormals - it allows reuse of the reciprocal for multiple divisions by the same divisor Test program: ``` #include <stdio.h> #include "cuda_fp16.h" // This is a variant of CUDA's own __hdiv which is faster than hdiv_promote below // and doesn't suffer from the perf cliff of div.rn.fp32 with 'special' values. __device__ half hdiv_newton(half a, half b) { float fa = __half2float(a); float fb = __half2float(b); float rcp; asm("{rcp.approx.ftz.f32 %0, %1;\n}" : "=f"(rcp) : "f"(fb)); float result = fa * rcp; auto exponent = reinterpret_cast<const unsigned&>(result) & 0x7f800000; if (exponent != 0 && exponent != 0x7f800000) { float err = __fmaf_rn(-fb, result, fa); result = __fmaf_rn(rcp, err, result); } return __float2half(result); } // Surprisingly, this is faster than CUDA's own __hdiv. __device__ half hdiv_promote(half a, half b) { return __float2half(__half2float(a) / __half2float(b)); } // This is an approximation that is accurate up to 1 ulp. 
__device__ half hdiv_approx(half a, half b) { float fa = __half2float(a); float fb = __half2float(b); float result; asm("{div.approx.ftz.f32 %0, %1, %2;\n}" : "=f"(result) : "f"(fa), "f"(fb)); return __float2half(result); } __global__ void CheckCorrectness() { int i = threadIdx.x + blockIdx.x * blockDim.x; half x = reinterpret_cast<const half&>(i); for (int j = 0; j < 65536; ++j) { half y = reinterpret_cast<const half&>(j); half d1 = hdiv_newton(x, y); half d2 = hdiv_promote(x, y); auto s1 = reinterpret_cast<const short&>(d1); auto s2 = reinterpret_cast<const short&>(d2); if (s1 != s2) { printf("%f (%u) / %f (%u), got %f (%hu), expected: %f (%hu)\n", __half2float(x), i, __half2float(y), j, __half2float(d1), s1, __half2float(d2), s2); //__trap(); } } } __device__ half dst; __global__ void ProfileBuiltin(half x) { #pragma unroll 1 for (int i = 0; i < 10000000; ++i) { x = x / x; } dst = x; } __global__ void ProfilePromote(half x) { #pragma unroll 1 for (int i = 0; i < 10000000; ++i) { x = hdiv_promote(x, x); } dst = x; } __global__ void ProfileNewton(half x) { #pragma unroll 1 for (int i = 0; i < 10000000; ++i) { x = hdiv_newton(x, x); } dst = x; } __global__ void ProfileApprox(half x) { #pragma unroll 1 for (int i = 0; i < 10000000; ++i) { x = hdiv_approx(x, x); } dst = x; } int main() { CheckCorrectness<<<256, 256>>>(); half one = __float2half(1.0f); ProfileBuiltin<<<1, 1>>>(one); // 1.001s ProfilePromote<<<1, 1>>>(one); // 0.560s ProfileNewton<<<1, 1>>>(one); // 0.508s ProfileApprox<<<1, 1>>>(one); // 0.304s auto status = cudaDeviceSynchronize(); printf("%s\n", cudaGetErrorString(status)); } ``` Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D126158	2022-06-04 08:03:29 +02:00
Peter Klausler	9c54d76251	[flang][runtime] Signal new I/O error on floating-point input overflow Besides raising the IEEE floating-point overflow exception, treat a floating-point overflow on input as an I/O error catchable with ERR=, IOSTAT=, &/or IOMSG=. Differential Revision: https://reviews.llvm.org/D127022	2022-06-03 22:55:03 -07:00
Amir Ayupov	b346af6d44	[BOLT][UTILS] Usability improvements for nfc-check-setup # Stash local changes before checkout. # Print a message that the source repository revision has been changed, with instructions to switch back. # Make the script executable. # Print sample instructions how to run bolt tests. # Assume that llvm-bolt-wrapper script is in the same source directory. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D126941	2022-06-03 22:54:56 -07:00
Peter Klausler	08c6a32381	[flang] Don't discard lower bounds of implicit-shape named constants F18 preserves lower bounds of explicit-shape named constant arrays, but failed to also do so for implicit-shape named constants. Fix. Differential Revision: https://reviews.llvm.org/D127021	2022-06-03 22:45:12 -07:00
Peter Klausler	f3278e0f3c	[flang][runtime] Ensure that 0. <= RANDOM_NUMBER() < 1. It was possible for RANDOM_NUMBER() to return 1.0. Differential Revision: https://reviews.llvm.org/D127020	2022-06-03 22:44:19 -07:00
Fangrui Song	025b309631	Revert D126950 "[lld][WebAssembly] Retain data segments referenced via __start/__stop" This reverts commit `dcf3368e33`. It breaks -DLLVM_ENABLE_ASSERTIONS=on builds. In addition, the description is incorrect about ld.lld behavior. For wasm, there should be justification to add the new mode.	2022-06-03 22:18:06 -07:00
Peter Klausler	15faac900d	[flang] Distinguish intrinsic module USE in module files; correct search paths In the USE statements that f18 emits to module files, ensure that symbols from intrinsic modules are marked as such on their USE statements. And ensure that the current working directory (".") cannot override the intrinsic module search path when trying to locate an intrinsic module. Differential Revision: https://reviews.llvm.org/D127019	2022-06-03 22:07:44 -07:00
Fangrui Song	72f9c69421	[Hexagon][bolt] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC Similar to `557efc9a8b`	2022-06-03 22:04:57 -07:00
Fangrui Song	734c223445	[clang-link-wrapper] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC Similar to `557efc9a8b`	2022-06-03 22:02:11 -07:00
Fangrui Song	557efc9a8b	[llvm] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC Some cl::ZeroOrMore were added to avoid the `may only occur zero or one times!` error. More were added due to cargo cult. Since the error has been removed, cl::ZeroOrMore is unneeded. Also remove cl::init(false) while touching the lines.	2022-06-03 21:59:05 -07:00
LiaoChunyu	f14d18c7a9	[RISCV] Add more patterns for FNMADD D54205 handles fnmadd: -rs1 * rs2 - rs3 This patch adds fnmadd: -(rs1 * rs2 + rs3) (the nsz flag on the FMA) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D126852	2022-06-04 12:31:45 +08:00
varconst	7c63cc198b	[libc++][ranges][NFC] Fix a patch link in ranges status.	2022-06-03 20:39:00 -07:00
varconst	faf43ad7ae	[libc++][ranges][NFC] Mark range algorithms that are in progress.	2022-06-03 20:02:46 -07:00
Yuta Saito	dcf3368e33	[lld][WebAssembly] Retain data segments referenced via __start/__stop As the ELF linker does, retain all data segments named X referenced through `__start_X` or `__stop_X`. For example, `FOO_MD` should not be stripped in the case below, but it is currently mis-stripped ```llvm @FOO_MD = global [4 x i8] c"bar\00", section "foo_md", align 1 @__start_foo_md = external constant i8* @__stop_foo_md = external constant i8* @llvm.used = appending global [1 x i8] [i8 bitcast (i32 ()* @foo_md_size to i8)], section "llvm.metadata" define i32 @foo_md_size() { entry: ret i32 sub ( i32 ptrtoint (i8* @__stop_foo_md to i32), i32 ptrtoint (i8** @__start_foo_md to i32) ) } ``` This fixes https://github.com/llvm/llvm-project/issues/55839 Reviewed By: sbc100 Differential Revision: https://reviews.llvm.org/D126950	2022-06-04 02:28:31 +00:00
Peter Klausler	e0adee8481	[flang] Correct folding of CSHIFT and EOSHIFT for DIM>1 The algorithm was wrong for higher dimensions, and so were the expected test results. Rework. Differential Revision: https://reviews.llvm.org/D127018	2022-06-03 18:59:44 -07:00
Fangrui Song	47ec8b5574	[pseudo] Fix leaks after D126731 Array Operator new Cookies help lsan find allocations, while std::array can't.	2022-06-03 18:43:16 -07:00
Peter Klausler	aa77cf90aa	[flang][runtime] Signal format error when input field width is zero A data edit descriptor for input may not have a zero field width. Differential Revision: https://reviews.llvm.org/D127017	2022-06-03 18:11:00 -07:00
Peter Klausler	e5a4f730da	[flang][runtime] OPEN write-only files If a file being opened with no ACTION= is write-only then cope with it rather than defaulting prematurely to treating it as read-only. Differential Revision: https://reviews.llvm.org/D127015	2022-06-03 18:09:40 -07:00
Craig Topper	cc3bd43533	[RISCV] Support LUI+ADDIW in doPeepholeLoadStoreADDI. This fixes an inconsistency between RV32 and RV64. Still considering trying to do this peephole during isel, but wanted to fix the inconsistency first. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D126986	2022-06-03 18:06:56 -07:00
Peter Klausler	9878facfd0	[flang][runtime] INQUIRE(FILE="...",SIZE=nbytes) Implement inquire-by-file SIZE= specifier. Differential Revision: https://reviews.llvm.org/D127014	2022-06-03 18:05:27 -07:00
Jake Egan	c3c75d805c	[clang][test] Mark test arm-float-abi-lto.c unsupported on AIX This test is failing after the introduction of opaque pointers (https://reviews.llvm.org/D125847). The test is flaky and fails with a segmentation fault, but it's unclear why. So, mark this test unsupported while it's investigated.	2022-06-03 21:04:56 -04:00
Paul Pluzhnikov	490990bb1f	[test] Modify test to verify D126396 (Clean "./" from __FILE__ expansion) Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D127009	2022-06-03 17:54:03 -07:00
Peter Klausler	da63fee0d0	[flang][runtime] Allow extra character for E0.0 output editing When the digit count ('d') is zero in E0 editing, allow for one more output character; otherwise, any - or + sign in the output causes an output field overflow. Differential Revision: https://reviews.llvm.org/D127013	2022-06-03 17:41:22 -07:00
wren romano	3cf03f1c56	[mlir][sparse] Adding IsSparseTensorPred and updating ops to use it Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D126994	2022-06-03 17:15:31 -07:00
Peter Klausler	604016dbe4	[flang][runtime] Fix bug with extra leading zero in octal output Octal (O) output editing often emits an extra leading 0 digit due to the total digit count being off by one since word sizes aren't multiples of three bits. Differential Revision: https://reviews.llvm.org/D127012	2022-06-03 17:02:07 -07:00
Peter Klausler	66a871b973	[flang] Fix crash in IsSaved() Code was accessing ProcEntityDetails in a symbol that didn't have them. Differential Revision: https://reviews.llvm.org/D127011	2022-06-03 17:00:01 -07:00
Florian Mayer	53c1584063	[NFC] [libunwind] turn assert into static_assert Reviewed By: #libunwind, MaskRay Differential Revision: https://reviews.llvm.org/D126987	2022-06-03 16:32:42 -07:00
Clemens Wasser	42c7f494d9	[tools] Forward declare classes & remove includes Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D120208	2022-06-03 16:32:04 -07:00
Christopher Bate	9f819f4c62	[mlir][linalg] fix crash in vectorization of elementwise operations The current vectorization logic implicitly expects "elementwise" linalg ops to have projected permutations for indexing maps, but the precondition logic misses this check. This can result in a crash when executing the generic vectorization transform on an op with a non-projected permutation input indexing map. This change fixes the logic and adds a test (which crashes without this fix). Differential Revision: https://reviews.llvm.org/D127000	2022-06-03 16:38:13 -06:00
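
Note on bcfc0a9051 (fp16 fdiv via reciprocal plus one Newton iteration): the refinement step described in that commit message can be sketched on the host in plain C++. This is only an illustration under stated assumptions, not the MLIR lowering itself. `approx_rcp` is a hypothetical stand-in for the hardware `rcp.approx.ftz.f32` instruction (it simply truncates mantissa bits of an exact reciprocal to inject error), and `div_newton` is an illustrative name. The sketch works in fp32 only, omits the final rounding to half, and makes no claim of bit-exactness.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a hardware approximate reciprocal:
// computes 1/x, then clears the low 12 mantissa bits to inject a small error.
static float approx_rcp(float x) {
  float r = 1.0f / x;
  std::uint32_t bits;
  std::memcpy(&bits, &r, sizeof bits);
  bits &= 0xFFFFF000u;
  std::memcpy(&r, &bits, sizeof r);
  return r;
}

// Division as multiplication by an approximate reciprocal, followed by one
// conditional Newton refinement, mirroring hdiv_newton in the commit message.
static float div_newton(float a, float b) {
  float rcp = approx_rcp(b);
  float result = a * rcp;

  // Extract the exponent field of the first estimate.
  std::uint32_t bits;
  std::memcpy(&bits, &result, sizeof bits);
  std::uint32_t exponent = bits & 0x7f800000u;

  // Skip the refinement for zero/denormal and inf/NaN estimates.
  if (exponent != 0 && exponent != 0x7f800000u) {
    float err = std::fmaf(-b, result, a);   // residual a - b * result
    result = std::fmaf(rcp, err, result);   // result + rcp * residual
  }
  return result;
}
```

With the crude reciprocal above, the first estimate of 1/3 has a relative error of roughly 2e-4; the single refinement squares that error, bringing the result to within about one fp32 ulp of the true quotient.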


425669 Commits, All Branches