[flang] edits

Original-commit: flang-compiler/f18@729cc19e0f
Reviewed-on: https://github.com/flang-compiler/f18/pull/534
Tree-same-pre-rewrite: false
peter klausler 2019-06-28 13:48:57 -07:00
parent 28e8f7a9fd
commit f0778f0fe2
1 changed files with 27 additions and 12 deletions


@ -5,13 +5,15 @@ Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved.
This note attempts to describe the motivation for and design of an
implementation of Fortran 90 (and later) array expression evaluation that
minimizes the use of dynamically allocated temporary storage for
the results of calls to transformational intrinsic functions.
the results of calls to transformational intrinsic functions, and
that makes those calls more amenable to acceleration.
The transformational intrinsic functions of Fortran of interest to
us here include:
* Reductions to scalars (`SUM(X)`, also `ALL`, `ANY`, `COUNT`, `IALL`,
`IANY`, `IPARITY`, `MAXVAL`, `MINVAL`, `PARITY`, `PRODUCT`)
* Reductions to scalars (`SUM(X)`, also `ALL`, `ANY`, `COUNT`,
`DOT_PRODUCT`,
`IALL`, `IANY`, `IPARITY`, `MAXVAL`, `MINVAL`, `PARITY`, `PRODUCT`)
* Axial reductions (`SUM(X,DIM=)`, &c.)
* Location reductions to indices (`MAXLOC`, `MINLOC`, `FINDLOC`)
* Axial location reductions (`MAXLOC(DIM=`, &c.)
@ -21,6 +23,7 @@ us here include:
* `CSHIFT` and `EOSHIFT` with scalar `SHIFT=`
* `CSHIFT` and `EOSHIFT` with array-valued `SHIFT=`
* `PACK` and `UNPACK`
* `MATMUL`
Other Fortran intrinsic functions are technically transformational (e.g.,
`COMMAND_ARGUMENT_COUNT`) but not of interest for this note.
@ -48,19 +51,25 @@ in the expression.
Consider `B = A + 1.0` (assuming `REAL :: A(N,M), B(N,M)`).
The right-hand side of that assignment could be evaluated into a
temporary array `T` and then subscripted as it is copied into `A`.
temporary array `T` and then subscripted as it is copied into `B`.
```
REAL, ALLOCATABLE :: T(:,:)
ALLOCATE(T(N,M))
FORALL(J=1:N,K=1:M) T(J,K)=A(J,K) + 1.0
FORALL(J=1:N,K=1:M) B(J,K)=T(J,K)
DO CONCURRENT(J=1:N,K=1:M)
T(J,K)=A(J,K) + 1.0
END DO
DO CONCURRENT(J=1:N,K=1:M)
B(J,K)=T(J,K)
END DO
DEALLOCATE(T)
```
But we can avoid the allocation, population, and deallocation of
the temporary by treating the right-hand side expression as if it
were a statement function `F(J,K)=A(J,K)+1.0` and evaluating
```
FORALL(J=1:N,K=1:M) B(J,K)=F(J,K)
DO CONCURRENT(J=1:N,K=1:M)
  B(J,K)=F(J,K)
END DO
```
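As a rough check of the equivalence described above, the following self-contained sketch evaluates `B = A + 1.0` once through an explicit temporary and once directly per element, then compares the results. The program name, the sizes `N` and `M`, and the internal function `F` (standing in for the statement function of the text) are illustrative assumptions, not part of any compiler implementation.

```fortran
program temp_free_demo
  implicit none
  integer, parameter :: n = 3, m = 4   ! illustrative sizes (assumed)
  real :: a(n,m), b1(n,m), b2(n,m)
  real, allocatable :: t(:,:)
  integer :: j, k

  ! Fill A with distinct values so an index mix-up would be visible.
  do k = 1, m
    do j = 1, n
      a(j,k) = real(10*j + k)
    end do
  end do

  ! Version 1: evaluate the right-hand side into a temporary, then copy.
  allocate(t(n,m))
  do concurrent (j=1:n, k=1:m)
    t(j,k) = a(j,k) + 1.0
  end do
  do concurrent (j=1:n, k=1:m)
    b1(j,k) = t(j,k)
  end do
  deallocate(t)

  ! Version 2: no temporary; the right-hand side is a function of (J,K).
  do concurrent (j=1:n, k=1:m)
    b2(j,k) = f(j,k)
  end do

  if (any(b1 /= b2)) error stop 'results differ'
  print *, 'temporary and temporary-free results agree'

contains

  ! Hypothetical stand-in for the statement function F(J,K)=A(J,K)+1.0.
  pure real function f(j, k)
    integer, intent(in) :: j, k
    f = a(j,k) + 1.0   ! A is read through host association
  end function f

end program temp_free_demo
```

The function is declared `PURE` because procedures referenced inside `DO CONCURRENT` must be pure.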
In general, when a Fortran array assignment to a non-allocatable array
@ -79,6 +88,10 @@ functions of the "incoming" indices.
For example, the application of `TRANSPOSE(A + 1.0)` to the index
tuple `(J,K)` becomes `A(K,J) + 1.0`.
Partial (axial) reductions can be similarly composed.
The application of `SUM(A,DIM=2)` to the index `J` is the
complete reduction `SUM(A(J,:))`.
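
The two index compositions above can be checked against the intrinsics themselves. The sketch below is illustrative only: the program name and array sizes are assumptions, and the exact equality tests rely on the small integer-valued data being exactly representable as `REAL`.

```fortran
program index_composition_demo
  implicit none
  integer, parameter :: n = 3, m = 4   ! illustrative sizes (assumed)
  real :: a(n,m), tr(m,n), s(n)
  integer :: j, k

  do k = 1, m
    do j = 1, n
      a(j,k) = real(10*j + k)
    end do
  end do

  ! Reference results computed by the intrinsics on whole arrays.
  tr = transpose(a + 1.0)
  s = sum(a, dim=2)

  ! TRANSPOSE(A + 1.0) applied to (J,K) is A(K,J) + 1.0.
  do k = 1, n
    do j = 1, m
      if (tr(j,k) /= a(k,j) + 1.0) error stop 'transpose mismatch'
    end do
  end do

  ! SUM(A,DIM=2) applied to J is the complete reduction SUM(A(J,:)).
  do j = 1, n
    if (s(j) /= sum(a(j,:))) error stop 'axial sum mismatch'
  end do

  print *, 'index compositions agree'
end program index_composition_demo
```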
Determination of rank and shape
===============================
An important part of evaluating array expressions without the use of
@ -88,16 +101,19 @@ or without, evaluating the elements of the result.
The shapes of array objects, results of elemental intrinsic functions,
and results of intrinsic operations are obvious.
But it is possible to determine the shapes of the results of many
transformantional intrinsic function calls as well.
transformational intrinsic function calls as well.
* `SHAPE(SUM(X,DIM=d))` is `SHAPE(X)` with one element removed.
The `DIM=` argument is commonly a compile-time constant.
* `SHAPE(SUM(X,DIM=d))` is `SHAPE(X)` with one element removed:
`PACK(SHAPE(X),[(j,j=1,RANK(X))]/=d)` in general.
(The `DIM=` argument is commonly a compile-time constant.)
* `SHAPE(MAXLOC(X))` is `[RANK(X)]`.
* `SHAPE(MAXLOC(X,DIM=d))` is `SHAPE(X)` with one element removed.
* `SHAPE(TRANSPOSE(M))` is a reversal of `SHAPE(M)`.
* `SHAPE(RESHAPE(..., SHAPE=S))` is `S`.
* `SHAPE(CSHIFT(X))` is `SHAPE(X)`; same with `EOSHIFT`.
* `SHAPE(PACK(A,VECTOR=V))` is `SHAPE(V)`; `RANK(PACK(...))` is always 1.
* `SHAPE(PACK(A,VECTOR=V))` is `SHAPE(V)`.
* `SHAPE(PACK(A,MASK=m))` with non-scalar `m` and without `VECTOR=` is `[COUNT(m)]`.
* `RANK(PACK(...))` is always 1.
* `SHAPE(UNPACK(MASK=M))` is `SHAPE(M)`.
* `SHAPE(SHAPE(X))` is `[RANK(X)]`.
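
The shape rules listed above can be spot-checked with a small self-checking program. This is a sketch under stated assumptions: the array extents, the value of `d`, and the helper subroutine `check` are all hypothetical, and the program only confirms the rules for these particular arguments. It needs a compiler that supports the `RANK` intrinsic (Fortran 2018).

```fortran
program shape_rules_demo
  implicit none
  integer, parameter :: d = 2          ! compile-time DIM= value (assumed)
  real :: x(2,3,4), mat(3,5), v(4)
  logical :: mask(3,5)
  integer :: j, sm(2)

  x = 0.0
  mat = 1.0
  v = -1.0
  mask = .false.
  mask(1,1) = .true.
  mask(2,4) = .true.                   ! COUNT(mask) is 2
  sm = shape(mat)

  call check(all(shape(sum(x, dim=d)) == &
                 pack(shape(x), [(j, j=1, rank(x))] /= d)), 'SUM(DIM=)')
  call check(all(shape(maxloc(x)) == [rank(x)]), 'MAXLOC')
  call check(all(shape(maxloc(x, dim=d)) == &
                 pack(shape(x), [(j, j=1, rank(x))] /= d)), 'MAXLOC(DIM=)')
  call check(all(shape(transpose(mat)) == sm(2:1:-1)), 'TRANSPOSE')
  call check(all(shape(reshape(x, shape=[6,4])) == [6,4]), 'RESHAPE')
  call check(all(shape(cshift(x, 1)) == shape(x)), 'CSHIFT')
  call check(all(shape(pack(mat, mask, vector=v)) == shape(v)), &
             'PACK(VECTOR=)')
  call check(all(shape(pack(mat, mask)) == [count(mask)]), 'PACK(MASK=)')
  call check(size(shape(pack(mat, mask))) == 1, 'RANK(PACK)')
  call check(all(shape(unpack(v, mask, 0.0)) == shape(mask)), 'UNPACK')
  call check(all(shape(shape(x)) == [rank(x)]), 'SHAPE(SHAPE)')

  print *, 'all shape rules hold'

contains

  ! Hypothetical helper: stop with a message if a rule does not hold.
  subroutine check(ok, what)
    logical, intent(in) :: ok
    character(*), intent(in) :: what
    if (.not. ok) then
      print *, 'shape rule failed: ', what
      error stop 1
    end if
  end subroutine check

end program shape_rules_demo
```

Each check compares shapes only; none of them depends on the element values of the results.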
@ -127,4 +143,3 @@ new shape.
The implementation of this feature also becomes more straightforward if
our implementation of array expressions has decoupled calculation of shapes
from the evaluation of the elements of the result.