[documentation][llvm-mca] Update the documentation.

Scheduling models can now describe processor register files and retire control
units. This updates the existing documentation and the README file.

llvm-svn: 329311
Andrea Di Biagio 2018-04-05 16:42:32 +00:00
parent 6ecdb03f16
commit efc3f39f02
2 changed files with 19 additions and 40 deletions

@ -65,15 +65,14 @@ option specifies "``-``", then the output will also be sent to standard output.
.. option:: -dispatch=<width>
Specify a different dispatch width for the processor. The dispatch width
defaults to the 'IssueWidth' specified by the processor scheduling model.
If width is zero, then the default dispatch width is used.
defaults to field 'IssueWidth' in the processor scheduling model. If width is
zero, then the default dispatch width is used.
.. option:: -register-file-size=<size>
Specify the size of the register file. When specified, this flag limits
how many temporary registers are available for register renaming purposes. By
default, the number of temporary registers is unlimited. A value of zero for
this flag means "unlimited number of temporary registers".
Specify the size of the register file. When specified, this flag limits how
many temporary registers are available for register renaming purposes. A value
of zero for this flag means "unlimited number of temporary registers".
.. option:: -iterations=<number of iterations>

@ -34,9 +34,7 @@ the purpose of scheduling instructions (and therefore not described by the
scheduling model), but are very important for this tool.
A few examples of details that are missing in scheduling models are:
- Maximum number of instructions retired per cycle.
- Actual dispatch width (it often differs from the issue width).
- Number of temporary registers available for renaming.
- Number of read/write ports in the register file(s).
- Length of the load/store queue in the LSUnit.
@ -387,17 +385,17 @@ An instruction can be dispatched if:
- There are enough temporary registers to do register renaming
- Schedulers are not full.
Scheduling models don't describe register files, and therefore the tool doesn't
know if there is more than one register file, and how many temporaries are
available for register renaming.
Since r329067, scheduling models can optionally specify which register files
are available on the processor. Class DispatchUnit (see Dispatch.h) uses that
information to initialize register file descriptors.
By default, the tool (optimistically) assumes a single register file with an
unbounded number of temporary registers. Users can limit the number of
temporary registers available for register renaming using flag
`-register-file-size=<N>`, where N is the number of temporaries. A value of
zero for N means 'unbounded'. Knowing how many temporaries are available for
register renaming, the tool can predict dispatch stalls caused by the lack of
temporaries.
By default, if the model doesn't describe register files, the tool
(optimistically) assumes a single register file with an unbounded number of
temporary registers. Users can limit the number of temporary registers that are
globally available for register renaming using flag `-register-file-size=<N>`,
where N is the number of temporaries. A value of zero for N means 'unbounded'.
Knowing how many temporaries are available for register renaming, the tool can
predict dispatch stalls caused by the lack of temporaries.
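The effect of limiting temporaries can be illustrated with a small toy model.
This is only a sketch under stated assumptions, not the tool's implementation:
the names `RegisterFile` and `count_dispatch_stalls`, the one-instruction-per-cycle
dispatch, and the fixed `lifetime` after which temporaries are released are all
hypothetical. The only behavior taken from the text above is that a size of zero
means 'unbounded'.

```python
from dataclasses import dataclass


@dataclass
class RegisterFile:
    """Toy register file (hypothetical); size == 0 means unbounded temporaries."""
    size: int = 0
    in_use: int = 0

    def can_allocate(self, num_defs: int) -> bool:
        # With size 0 the renamer never runs out of temporaries.
        return self.size == 0 or self.in_use + num_defs <= self.size

    def allocate(self, num_defs: int) -> None:
        self.in_use += num_defs

    def release(self, num_defs: int) -> None:
        self.in_use -= num_defs


def count_dispatch_stalls(defs_per_instr, size, lifetime):
    """Return the number of cycles lost waiting for free temporaries.

    Assumptions (not from the tool): at most one instruction is dispatched per
    cycle, and each instruction holds its temporaries for `lifetime` cycles.
    """
    rf = RegisterFile(size)
    pending = []  # (release_cycle, num_defs) for instructions in flight
    cycle = 0
    stalls = 0
    for num_defs in defs_per_instr:
        if size and num_defs > size:
            raise ValueError("instruction needs more temporaries than the file holds")
        while True:
            # Release temporaries whose lifetime ended at or before this cycle.
            for entry in [p for p in pending if p[0] <= cycle]:
                rf.release(entry[1])
                pending.remove(entry)
            if rf.can_allocate(num_defs):
                break
            # Not enough temporaries: the dispatch stalls for one cycle.
            stalls += 1
            cycle += 1
        rf.allocate(num_defs)
        pending.append((cycle + lifetime, num_defs))
        cycle += 1
    return stalls
```

Calling `count_dispatch_stalls([1, 1, 1, 1], size=0, lifetime=4)` models the
default (unbounded) case and never stalls, while a small non-zero `size` forces
later instructions to wait until earlier temporaries are released.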
The number of reorder buffer entries consumed by an instruction depends on the
number of micro-opcodes it specifies in the target scheduling model (see field
@ -667,25 +665,6 @@ instructions are not evaluated, and therefore control flow is not affected.
However, the tool still queries the processor scheduling model to obtain latency
information for instructions that affect the control flow.
Possible extensions to the scheduling model
-------------------------------------------
Section "Instruction Dispatch" explained how the tool doesn't know about the
register files, and temporaries available in each register file for register
renaming purposes.
The LLVM scheduling model could be extended to better describe register files.
Ideally, a scheduling model should be able to define:
- The size of each register file
- How many temporary registers are available for register renaming
- How register classes map to register files
The scheduling model doesn't specify the retire throughput (i.e. how many
instructions can be retired every cycle). Users can specify flag
`-max-retire-per-cycle=<uint>` to limit how many instructions the retire control
unit can retire every cycle. Ideally, every processor should be able to specify
the retire throughput (for example, by adding an extra field to the scheduling
model tablegen class).
Known limitations on X86 processors
-----------------------------------
@ -867,8 +846,6 @@ analysis.
Future work
-----------
* Address limitations (described in section "Known limitations").
* Integrate extra description in the processor models, and make it opt-in for
the targets (see section "Possible extensions to the scheduling model").
* Let processors specify the selection strategy for processor resource groups
and resources with multiple units. The tool currently uses a round-robin
selector to pick the next resource to use.
@ -877,8 +854,11 @@ Future work
* Address design issues identified in section "Known design problems".
* Define a standard interface for "Views". This would let users customize the
performance report generated by the tool.
* Simplify the Backend interface.
When interfaces are mature/stable:
* Move the logic into a library. This will enable a number of other
interesting use cases.
Work is currently tracked on https://bugs.llvm.org. llvm-mca bugs are tagged
with prefix [llvm-mca]. You can easily find the full list of open bugs if you
search for that tag.