llvm-project/flang/documentation/ArrayComposition.md

<!--===- documentation/ArrayComposition.md 
  
   Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
   See https://llvm.org/LICENSE.txt for license information.
   SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
  
-->

This note attempts to describe the motivation for and design of an
implementation of Fortran 90 (and later) array expression evaluation that
minimizes the use of dynamically allocated temporary storage for
the results of calls to transformational intrinsic functions, and
making them more amenable to acceleration.

The transformational intrinsic functions of Fortran of interest to
us here include:

* Reductions to scalars (`SUM(X)`, also `ALL`, `ANY`, `COUNT`,
  `DOT_PRODUCT`,
  `IALL`, `IANY`, `IPARITY`, `MAXVAL`, `MINVAL`, `PARITY`, `PRODUCT`)
* Axial reductions (`SUM(X,DIM=)`, &c.)
* Location reductions to indices (`MAXLOC`, `MINLOC`, `FINDLOC`)
* Axial location reductions (`MAXLOC(DIM=`, &c.)
* `TRANSPOSE(M)` matrix transposition
* `RESHAPE` without `ORDER=`
* `RESHAPE` with `ORDER=`
* `CSHIFT` and `EOSHIFT` with scalar `SHIFT=`
* `CSHIFT` and `EOSHIFT` with array-valued `SHIFT=`
* `PACK` and `UNPACK`
* `MATMUL`
* `SPREAD`

Other Fortran intrinsic functions are technically transformational (e.g.,
`COMMAND_ARGUMENT_COUNT`) but not of interest for this note.
The generic `REDUCE` is also not considered here.

Arrays as functions
===================
A whole array can be viewed as a function that maps its indices to the values
of its elements.
Specifically, it is a map from a tuple of integers to its element type.
The rank of the array is the number of elements in that tuple,
and the shape of the array delimits the domain of the map.

`REAL :: A(N,M)` can be seen as a function mapping ordered pairs of integers
`(J,K)` with `1<=J<=N` and `1<=J<=M` to real values.

Array expressions as functions
==============================
The same perspective can be taken of an array expression comprising
intrinsic operators and elemental functions.
Fortran doesn't allow one to apply subscripts directly to an expression,
but expressions have rank and shape, and one can view array expressions
as functions over index tuples by applying those indices to the arrays
and subexpressions in the expression.

Consider `B = A + 1.0` (assuming `REAL :: A(N,M), B(N,M)`).
The right-hand side of that assignment could be evaluated into a
temporary array `T` and then subscripted as it is copied into `B`.
```
REAL, ALLOCATABLE :: T(:,:)
ALLOCATE(T(N,M))
DO CONCURRENT(J=1:N,K=1:M)
  T(J,K)=A(J,K) + 1.0
END DO
DO CONCURRENT(J=1:N,K=1:M)
  B(J,K)=T(J,K)
END DO
DEALLOCATE(T)
```
But we can avoid the allocation, population, and deallocation of
the temporary by treating the right-hand side expression as if it
were a statement function `F(J,K)=A(J,K)+1.0` and evaluating
```
DO CONCURRENT(J=1:N,K=1:M)
  A(J,K)=F(J,K)
END DO
```

In general, when a Fortran array assignment to a non-allocatable array
does not include the left-hand
side variable as an operand of the right-hand side expression, and any
function calls on the right-hand side are elemental or scalar-valued,
we can avoid the use of a temporary.

Transformational intrinsic functions as function composition
============================================================
Many of the transformational intrinsic functions listed above
can, when their array arguments are viewed as functions over their
index tuples, be seen as compositions of those functions with
functions of the "incoming" indices -- yielding a function for
an entire right-hand side of an array assignment statement.

For example, the application of `TRANSPOSE(A + 1.0)` to the index
tuple `(J,K)` becomes `A(K,J) + 1.0`.

Partial (axial) reductions can be similarly composed.
The application of `SUM(A,DIM=2)` to the index `J` is the
complete reduction `SUM(A(J,:))`.

More completely:
* Reductions to scalars (`SUM(X)` without `DIM=`) become
  runtime calls; the result needs no dynamic allocation,
  being a scalar.
* Axial reductions (`SUM(X,DIM=d)`) applied to indices `(J,K)`
  become scalar values like `SUM(X(J,K,:))` if `d=3`.
* Location reductions to indices (`MAXLOC(X)` without `DIM=`)
  do not require dynamic allocation, since their results are
  either scalar or small vectors of length `RANK(X)`.
* Axial location reductions (`MAXLOC(X,DIM=)`, &c.)
  are handled like other axial reductions like `SUM(DIM=)`.
* `TRANSPOSE(M)` exchanges the two components of the index tuple.
* `RESHAPE(A,SHAPE=s)` without `ORDER=` must precompute the shape
  vector `S`, and then use it to linearize indices into offsets
  in the storage order of `A` (whose shape must also be captured).
  These conversions can involve division and/or modulus, which
  can be optimized into a fixed-point multiplication using the
  usual technique.
* `RESHAPE` with `ORDER=` is similar, but must permute the
  components of the index tuple; it generalizes `TRANSPOSE`.
* `CSHIFT` applies addition and modulus.
* `EOSHIFT` applies addition and a conditional move (`MERGE`).
* `PACK` and `UNPACK` are likely to require a runtime call.
* `MATMUL(A,B)` can become `DOT_PRODUCT(A(J,:),B(:,K))`, but
  might benefit from calling a highly optimized runtime
  routine.
* `SPREAD(A,DIM=d,NCOPIES=n)` for compile-time `d` simply
  applies `A` to a reduced index tuple.

Determination of rank and shape
===============================
An important part of evaluating array expressions without the use of
temporary storage is determining the shape of the result prior to,
or without, evaluating the elements of the result.

The shapes of array objects, results of elemental intrinsic functions,
and results of intrinsic operations are obvious.
But it is possible to determine the shapes of the results of many
transformational intrinsic function calls as well.

* `SHAPE(SUM(X,DIM=d))` is `SHAPE(X)` with one element removed:
  `PACK(SHAPE(X),[(j,j=1,RANK(X))]/=d)` in general.
  (The `DIM=` argument is commonly a compile-time constant.)
* `SHAPE(MAXLOC(X))` is `[RANK(X)]`.
* `SHAPE(MAXLOC(X,DIM=d))` is `SHAPE(X)` with one element removed.
* `SHAPE(TRANSPOSE(M))` is a reversal of `SHAPE(M)`.
* `SHAPE(RESHAPE(..., SHAPE=S))` is `S`.
* `SHAPE(CSHIFT(X))` is `SHAPE(X)`; same with `EOSHIFT`.
* `SHAPE(PACK(A,VECTOR=V))` is `SHAPE(V)`
* `SHAPE(PACK(A,MASK=m))` with non-scalar `m` and without `VECTOR=` is `[COUNT(m)]`.
* `RANK(PACK(...))` is always 1.
* `SHAPE(UNPACK(MASK=M))` is `SHAPE(M)`.
* `SHAPE(MATMUL(A,B))` drops one value from `SHAPE(A)` and another from `SHAPE(B)`.
* `SHAPE(SHAPE(X))` is `[RANK(X)]`.
* `SHAPE(SPREAD(A,DIM=d,NCOPIES=n))` is `SHAPE(A)` with `n` inserted at
  dimension `d`.

This is useful because expression evaluations that *do* require temporaries
to hold their results (due to the context in which the evaluation occurs)
can be implemented with a separation of the allocation
of the temporary array and the population of the array.
The code that evaluates the expression, or that implements a transformational
intrinsic in the runtime library, can be designed with an API that includes
a pointer to the destination array as an argument.

Statements like `ALLOCATE(A,SOURCE=expression)` should thus be capable
of evaluating their array expressions directly into the newly-allocated
storage for the allocatable array.
The implementation would generate code to calculate the shape, use it
to allocate the memory and populate the descriptor, and then drive a
loop nest around the expression to populate the array.
In cases where the analyzed shape is known at compile time, we should
be able to have the opportunity to avoid heap allocation in favor of
stack storage, if the scope of the variable is local.

Automatic reallocation of allocatables
======================================
Fortran 2003 introduced the ability to assign non-conforming array expressions
to ALLOCATABLE arrays with the implied semantics of reallocation to the
new shape.
The implementation of this feature also becomes more straightforward if
our implementation of array expressions has decoupled calculation of shapes
from the evaluation of the elements of the result.

Rewriting rules
===============
Let `{...}` denote an ordered tuple of 1-based indices, e.g. `{j,k}`, into
the result of an array expression or subexpression.

* Array constructors always yield vectors; higher-rank arrays that appear as
  constituents are flattened; so `[X] => RESHAPE(X,SHAPE=[SIZE(X)})`.
* Array constructors with multiple constituents are concatenations of
  their constituents; so `[X,Y]{j} => MERGE(Y{j-SIZE(X)},X{j},J>SIZE(X))`.
* Array constructors with implied DO loops are difficult when nested
  triangularly.
* Whole array references can have lower bounds other than 1, so
  `A => A(LBOUND(A,1):UBOUND(A,1),...)`.
* Array sections simply apply indices: `A(i:...:n){j} => A(i1+n*(j-1))`.
* Vector-valued subscripts apply indices to the subscript: `A(N(:)){j} => A(N(:){j})`.
* Scalar operands ignore indices: `X{j,k} => X`.
  Further, they are evaluated at most once.
* Elemental operators and functions apply indices to their arguments:
  `(A(:,:) + B(:,:)){j,k}` => A(:,:){j,k} + B(:,:){j,k}`.
* `TRANSPOSE(X){j,k} => X{k,j}`.
* `SPREAD(X,DIM=2,...){j,k} => X{j}`; i.e., the contents are replicated.
  If X is sufficiently expensive to compute elementally, it might be evaluated
  into a temporary.

(more...)
[flang] Flang relicensing changes for LLVM Apache 2.0 license This changes the license information in many of the flang source files. - Renamed LICENSE to LICENSE.txt. - NVIDIA Copyright lines have been removed. - Initial lines for files follow the LLVM coding convention (file name on the first line; Emacs mode information on the first line). - License references have been replaced with the abridged LLVM text. - License information was removed from the test files. - No file header was placed on test files (these weren't in most LLVM test files). - License information was added to documentation files where it was missing. We did not add brief file summaries to the initial line. See http://llvm.org/docs/DeveloperPolicy.html#new-llvm-project-license-framework for a description of the new license. See http://llvm.org/docs/CodingStandards.html#file-headers for a description of the new LLVM standard file header. Original-commit: flang-compiler/f18@add6cde7244b926ca457a1c74c21fea071b85329 Reviewed-on: https://github.com/flang-compiler/f18/pull/887 2019-12-21 04:52:07 +08:00			`<!--===- documentation/ArrayComposition.md`

			`Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.`
			`See https://llvm.org/LICENSE.txt for license information.`
			`SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception`

[flang] Jot down thoughts on array expr and intrinsic evaluation for Jean Original-commit: flang-compiler/f18@83c72062d57f737955d8d3ebdfed4a08f5a9107b Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-06-28 07:01:20 +08:00			`-->`

			`This note attempts to describe the motivation for and design of an`
			`implementation of Fortran 90 (and later) array expression evaluation that`
			`minimizes the use of dynamically allocated temporary storage for`
[flang] edits Original-commit: flang-compiler/f18@729cc19e0f8e18cd13026723df6578c7e786c9c4 Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-06-29 04:48:57 +08:00			`the results of calls to transformational intrinsic functions, and`
			`making them more amenable to acceleration.`
[flang] Jot down thoughts on array expr and intrinsic evaluation for Jean Original-commit: flang-compiler/f18@83c72062d57f737955d8d3ebdfed4a08f5a9107b Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-06-28 07:01:20 +08:00
			`The transformational intrinsic functions of Fortran of interest to`
			`us here include:`

[flang] edits Original-commit: flang-compiler/f18@729cc19e0f8e18cd13026723df6578c7e786c9c4 Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-06-29 04:48:57 +08:00			* Reductions to scalars (`SUM(X)`, also `ALL`, `ANY`, `COUNT`,
			`DOT_PRODUCT`,
			`IALL`, `IANY`, `IPARITY`, `MAXVAL`, `MINVAL`, `PARITY`, `PRODUCT`)
[flang] Jot down thoughts on array expr and intrinsic evaluation for Jean Original-commit: flang-compiler/f18@83c72062d57f737955d8d3ebdfed4a08f5a9107b Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-06-28 07:01:20 +08:00			* Axial reductions (`SUM(X,DIM=)`, &c.)
			* Location reductions to indices (`MAXLOC`, `MINLOC`, `FINDLOC`)
			* Axial location reductions (`MAXLOC(DIM=`, &c.)
			* `TRANSPOSE(M)` matrix transposition
			* `RESHAPE` without `ORDER=`
			* `RESHAPE` with `ORDER=`
			* `CSHIFT` and `EOSHIFT` with scalar `SHIFT=`
			* `CSHIFT` and `EOSHIFT` with array-valued `SHIFT=`
			* `PACK` and `UNPACK`
[flang] edits Original-commit: flang-compiler/f18@729cc19e0f8e18cd13026723df6578c7e786c9c4 Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-06-29 04:48:57 +08:00			* `MATMUL`
[flang] edits Original-commit: flang-compiler/f18@ff3cab0bb5585f5a4386146c9b7eb9a79be04385 Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-07-03 02:45:35 +08:00			* `SPREAD`
[flang] Jot down thoughts on array expr and intrinsic evaluation for Jean Original-commit: flang-compiler/f18@83c72062d57f737955d8d3ebdfed4a08f5a9107b Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-06-28 07:01:20 +08:00
			`Other Fortran intrinsic functions are technically transformational (e.g.,`
			`COMMAND_ARGUMENT_COUNT`) but not of interest for this note.
			The generic `REDUCE` is also not considered here.

			`Arrays as functions`
			`===================`
			`A whole array can be viewed as a function that maps its indices to the values`
			`of its elements.`
			`Specifically, it is a map from a tuple of integers to its element type.`
			`The rank of the array is the number of elements in that tuple,`
			`and the shape of the array delimits the domain of the map.`

			`REAL :: A(N,M)` can be seen as a function mapping ordered pairs of integers
			`(J,K)` with `1<=J<=N` and `1<=J<=M` to real values.

			`Array expressions as functions`
			`==============================`
			`The same perspective can be taken of an array expression comprising`
			`intrinsic operators and elemental functions.`
			`Fortran doesn't allow one to apply subscripts directly to an expression,`
			`but expressions have rank and shape, and one can view array expressions`
			`as functions over index tuples by applying those indices to the arrays`
[flang] edits Original-commit: flang-compiler/f18@07da944e4b9ab01880c3770d826ecf044958cad6 Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-06-29 04:53:05 +08:00			`and subexpressions in the expression.`
[flang] Jot down thoughts on array expr and intrinsic evaluation for Jean Original-commit: flang-compiler/f18@83c72062d57f737955d8d3ebdfed4a08f5a9107b Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-06-28 07:01:20 +08:00
			Consider `B = A + 1.0` (assuming `REAL :: A(N,M), B(N,M)`).
			`The right-hand side of that assignment could be evaluated into a`
[flang] edits Original-commit: flang-compiler/f18@729cc19e0f8e18cd13026723df6578c7e786c9c4 Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-06-29 04:48:57 +08:00			temporary array `T` and then subscripted as it is copied into `B`.
[flang] Jot down thoughts on array expr and intrinsic evaluation for Jean Original-commit: flang-compiler/f18@83c72062d57f737955d8d3ebdfed4a08f5a9107b Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-06-28 07:01:20 +08:00			```
			`REAL, ALLOCATABLE :: T(:,:)`
			`ALLOCATE(T(N,M))`
[flang] edits Original-commit: flang-compiler/f18@729cc19e0f8e18cd13026723df6578c7e786c9c4 Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-06-29 04:48:57 +08:00			`DO CONCURRENT(J=1:N,K=1:M)`
			`T(J,K)=A(J,K) + 1.0`
			`END DO`
			`DO CONCURRENT(J=1:N,K=1:M)`
			`B(J,K)=T(J,K)`
			`END DO`
[flang] edits Original-commit: flang-compiler/f18@07da944e4b9ab01880c3770d826ecf044958cad6 Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-06-29 04:53:05 +08:00			`DEALLOCATE(T)`
[flang] Jot down thoughts on array expr and intrinsic evaluation for Jean Original-commit: flang-compiler/f18@83c72062d57f737955d8d3ebdfed4a08f5a9107b Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-06-28 07:01:20 +08:00			```
			`But we can avoid the allocation, population, and deallocation of`
			`the temporary by treating the right-hand side expression as if it`
			were a statement function `F(J,K)=A(J,K)+1.0` and evaluating
			```
[flang] edits Original-commit: flang-compiler/f18@729cc19e0f8e18cd13026723df6578c7e786c9c4 Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-06-29 04:48:57 +08:00			`DO CONCURRENT(J=1:N,K=1:M)`
			`A(J,K)=F(J,K)`
			`END DO`
[flang] Jot down thoughts on array expr and intrinsic evaluation for Jean Original-commit: flang-compiler/f18@83c72062d57f737955d8d3ebdfed4a08f5a9107b Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-06-28 07:01:20 +08:00			```

			`In general, when a Fortran array assignment to a non-allocatable array`
			`does not include the left-hand`
			`side variable as an operand of the right-hand side expression, and any`
			`function calls on the right-hand side are elemental or scalar-valued,`
			`we can avoid the use of a temporary.`

			`Transformational intrinsic functions as function composition`
			`============================================================`
			`Many of the transformational intrinsic functions listed above`
			`can, when their array arguments are viewed as functions over their`
			`index tuples, be seen as compositions of those functions with`
[flang] edits Original-commit: flang-compiler/f18@07da944e4b9ab01880c3770d826ecf044958cad6 Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-06-29 04:53:05 +08:00			`functions of the "incoming" indices -- yielding a function for`
			`an entire right-hand side of an array assignment statement.`
[flang] Jot down thoughts on array expr and intrinsic evaluation for Jean Original-commit: flang-compiler/f18@83c72062d57f737955d8d3ebdfed4a08f5a9107b Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-06-28 07:01:20 +08:00
			For example, the application of `TRANSPOSE(A + 1.0)` to the index
			tuple `(J,K)` becomes `A(K,J) + 1.0`.

[flang] edits Original-commit: flang-compiler/f18@729cc19e0f8e18cd13026723df6578c7e786c9c4 Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-06-29 04:48:57 +08:00			`Partial (axial) reductions can be similarly composed.`
			The application of `SUM(A,DIM=2)` to the index `J` is the
			complete reduction `SUM(A(J,:))`.

[flang] More edits Original-commit: flang-compiler/f18@6b9ce52250373644098930ee6f28cb4001095ba5 Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-06-29 05:45:27 +08:00			`More completely:`
			* Reductions to scalars (`SUM(X)` without `DIM=`) become
			`runtime calls; the result needs no dynamic allocation,`
			`being a scalar.`
			* Axial reductions (`SUM(X,DIM=d)`) applied to indices `(J,K)`
			become scalar values like `SUM(X(J,K,:))` if `d=3`.
			* Location reductions to indices (`MAXLOC(X)` without `DIM=`)
			`do not require dynamic allocation, since their results are`
			either scalar or small vectors of length `RANK(X)`.
			* Axial location reductions (`MAXLOC(X,DIM=)`, &c.)
			are handled like other axial reductions like `SUM(DIM=)`.
			* `TRANSPOSE(M)` exchanges the two components of the index tuple.
			* `RESHAPE(A,SHAPE=s)` without `ORDER=` must precompute the shape
			vector `S`, and then use it to linearize indices into offsets
			in the storage order of `A` (whose shape must also be captured).
			`These conversions can involve division and/or modulus, which`
			`can be optimized into a fixed-point multiplication using the`
			`usual technique.`
			* `RESHAPE` with `ORDER=` is similar, but must permute the
			components of the index tuple; it generalizes `TRANSPOSE`.
			* `CSHIFT` applies addition and modulus.
[flang] edits Original-commit: flang-compiler/f18@ff3cab0bb5585f5a4386146c9b7eb9a79be04385 Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-07-03 02:45:35 +08:00			* `EOSHIFT` applies addition and a conditional move (`MERGE`).
[flang] More edits Original-commit: flang-compiler/f18@6b9ce52250373644098930ee6f28cb4001095ba5 Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-06-29 05:45:27 +08:00			* `PACK` and `UNPACK` are likely to require a runtime call.
			* `MATMUL(A,B)` can become `DOT_PRODUCT(A(J,:),B(:,K))`, but
			`might benefit from calling a highly optimized runtime`
			`routine.`
[flang] edits Original-commit: flang-compiler/f18@ff3cab0bb5585f5a4386146c9b7eb9a79be04385 Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-07-03 02:45:35 +08:00			* `SPREAD(A,DIM=d,NCOPIES=n)` for compile-time `d` simply
			applies `A` to a reduced index tuple.
[flang] More edits Original-commit: flang-compiler/f18@6b9ce52250373644098930ee6f28cb4001095ba5 Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-06-29 05:45:27 +08:00
[flang] Jot down thoughts on array expr and intrinsic evaluation for Jean Original-commit: flang-compiler/f18@83c72062d57f737955d8d3ebdfed4a08f5a9107b Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-06-28 07:01:20 +08:00			`Determination of rank and shape`
			`===============================`
			`An important part of evaluating array expressions without the use of`
			`temporary storage is determining the shape of the result prior to,`
			`or without, evaluating the elements of the result.`

			`The shapes of array objects, results of elemental intrinsic functions,`
			`and results of intrinsic operations are obvious.`
			`But it is possible to determine the shapes of the results of many`
[flang] edits Original-commit: flang-compiler/f18@729cc19e0f8e18cd13026723df6578c7e786c9c4 Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-06-29 04:48:57 +08:00			`transformational intrinsic function calls as well.`
[flang] Jot down thoughts on array expr and intrinsic evaluation for Jean Original-commit: flang-compiler/f18@83c72062d57f737955d8d3ebdfed4a08f5a9107b Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-06-28 07:01:20 +08:00
[flang] edits Original-commit: flang-compiler/f18@729cc19e0f8e18cd13026723df6578c7e786c9c4 Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-06-29 04:48:57 +08:00			* `SHAPE(SUM(X,DIM=d))` is `SHAPE(X)` with one element removed:
			`PACK(SHAPE(X),[(j,j=1,RANK(X))]/=d)` in general.
			(The `DIM=` argument is commonly a compile-time constant.)
[flang] Jot down thoughts on array expr and intrinsic evaluation for Jean Original-commit: flang-compiler/f18@83c72062d57f737955d8d3ebdfed4a08f5a9107b Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-06-28 07:01:20 +08:00			* `SHAPE(MAXLOC(X))` is `[RANK(X)]`.
			* `SHAPE(MAXLOC(X,DIM=d))` is `SHAPE(X)` with one element removed.
			* `SHAPE(TRANSPOSE(M))` is a reversal of `SHAPE(M)`.
			* `SHAPE(RESHAPE(..., SHAPE=S))` is `S`.
			* `SHAPE(CSHIFT(X))` is `SHAPE(X)`; same with `EOSHIFT`.
[flang] edits Original-commit: flang-compiler/f18@729cc19e0f8e18cd13026723df6578c7e786c9c4 Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-06-29 04:48:57 +08:00			* `SHAPE(PACK(A,VECTOR=V))` is `SHAPE(V)`
			* `SHAPE(PACK(A,MASK=m))` with non-scalar `m` and without `VECTOR=` is `[COUNT(m)]`.
			* `RANK(PACK(...))` is always 1.
[flang] Jot down thoughts on array expr and intrinsic evaluation for Jean Original-commit: flang-compiler/f18@83c72062d57f737955d8d3ebdfed4a08f5a9107b Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-06-28 07:01:20 +08:00			* `SHAPE(UNPACK(MASK=M))` is `SHAPE(M)`.
[flang] edits Original-commit: flang-compiler/f18@ff3cab0bb5585f5a4386146c9b7eb9a79be04385 Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-07-03 02:45:35 +08:00			* `SHAPE(MATMUL(A,B))` drops one value from `SHAPE(A)` and another from `SHAPE(B)`.
[flang] Jot down thoughts on array expr and intrinsic evaluation for Jean Original-commit: flang-compiler/f18@83c72062d57f737955d8d3ebdfed4a08f5a9107b Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-06-28 07:01:20 +08:00			* `SHAPE(SHAPE(X))` is `[RANK(X)]`.
[flang] edits Original-commit: flang-compiler/f18@ff3cab0bb5585f5a4386146c9b7eb9a79be04385 Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-07-03 02:45:35 +08:00			* `SHAPE(SPREAD(A,DIM=d,NCOPIES=n))` is `SHAPE(A)` with `n` inserted at
			dimension `d`.
[flang] Jot down thoughts on array expr and intrinsic evaluation for Jean Original-commit: flang-compiler/f18@83c72062d57f737955d8d3ebdfed4a08f5a9107b Reviewed-on: https://github.com/flang-compiler/f18/pull/534 Tree-same-pre-rewrite: false 2019-06-28 07:01:20 +08:00
			`This is useful because expression evaluations that do require temporaries`
			`to hold their results (due to the context in which the evaluation occurs)`
			`can be implemented with a separation of the allocation`
			`of the temporary array and the population of the array.`
			`The code that evaluates the expression, or that implements a transformational`
			`intrinsic in the runtime library, can be designed with an API that includes`
			`a pointer to the destination array as an argument.`

			Statements like `ALLOCATE(A,SOURCE=expression)` should thus be capable
			`of evaluating their array expressions directly into the newly-allocated`
			`storage for the allocatable array.`
			`The implementation would generate code to calculate the shape, use it`
			`to allocate the memory and populate the descriptor, and then drive a`
			`loop nest around the expression to populate the array.`
			`In cases where the analyzed shape is known at compile time, we should`
			`be able to have the opportunity to avoid heap allocation in favor of`
			`stack storage, if the scope of the variable is local.`

			`Automatic reallocation of allocatables`
			`======================================`
			`Fortran 2003 introduced the ability to assign non-conforming array expressions`
			`to ALLOCATABLE arrays with the implied semantics of reallocation to the`
			`new shape.`
			`The implementation of this feature also becomes more straightforward if`
			`our implementation of array expressions has decoupled calculation of shapes`
			`from the evaluation of the elements of the result.`
[flang] More writing Original-commit: flang-compiler/f18@8dce9cb7f0487c82f7f2a277e4bea6f7aa1fd3d2 Reviewed-on: https://github.com/flang-compiler/f18/pull/534 2019-07-11 05:32:46 +08:00
			`Rewriting rules`
			`===============`
			Let `{...}` denote an ordered tuple of 1-based indices, e.g. `{j,k}`, into
			`the result of an array expression or subexpression.`

			`* Array constructors always yield vectors; higher-rank arrays that appear as`
			constituents are flattened; so `[X] => RESHAPE(X,SHAPE=[SIZE(X)})`.
			`* Array constructors with multiple constituents are concatenations of`
			their constituents; so `[X,Y]{j} => MERGE(Y{j-SIZE(X)},X{j},J>SIZE(X))`.
			`* Array constructors with implied DO loops are difficult when nested`
			`triangularly.`
			`* Whole array references can have lower bounds other than 1, so`
			`A => A(LBOUND(A,1):UBOUND(A,1),...)`.
			* Array sections simply apply indices: `A(i:...:n){j} => A(i1+n*(j-1))`.
			* Vector-valued subscripts apply indices to the subscript: `A(N(:)){j} => A(N(:){j})`.
			* Scalar operands ignore indices: `X{j,k} => X`.
			`Further, they are evaluated at most once.`
			`* Elemental operators and functions apply indices to their arguments:`
			`(A(:,:) + B(:,:)){j,k}` => A(:,:){j,k} + B(:,:){j,k}`.
			* `TRANSPOSE(X){j,k} => X{k,j}`.
			* `SPREAD(X,DIM=2,...){j,k} => X{j}`; i.e., the contents are replicated.
			`If X is sufficiently expensive to compute elementally, it might be evaluated`
			`into a temporary.`

			`(more...)`