[documentation][llvm-mca] Update the documentation.
Scheduling models can now describe processor register files and retire control units. This updates the existing documentation and the README file.

llvm-svn: 329311
parent 6ecdb03f16
commit efc3f39f02

@@ -65,15 +65,14 @@ option specifies "``-``", then the output will also be sent to standard output.
 .. option:: -dispatch=<width>

 Specify a different dispatch width for the processor. The dispatch width
-defaults to the 'IssueWidth' specified by the processor scheduling model.
-If width is zero, then the default dispatch width is used.
+defaults to field 'IssueWidth' in the processor scheduling model. If width is
+zero, then the default dispatch width is used.

 .. option:: -register-file-size=<size>

-Specify the size of the register file. When specified, this flag limits
-how many temporary registers are available for register renaming purposes. By
-default, the number of temporary registers is unlimited. A value of zero for
-this flag means "unlimited number of temporary registers".
+Specify the size of the register file. When specified, this flag limits how
+many temporary registers are available for register renaming purposes. A value
+of zero for this flag means "unlimited number of temporary registers".

 .. option:: -iterations=<number of iterations>

@@ -34,9 +34,7 @@ the purpose of scheduling instructions (and therefore not described by the
 scheduling model), but are very important for this tool.

 A few examples of details that are missing in scheduling models are:
-- Maximum number of instructions retired per cycle.
 - Actual dispatch width (it often differs from the issue width).
-- Number of temporary registers available for renaming.
 - Number of read/write ports in the register file(s).
 - Length of the load/store queue in the LSUnit.

@@ -387,17 +385,17 @@ An instruction can be dispatched if:
 - There are enough temporary registers to do register renaming
 - Schedulers are not full.

-Scheduling models don't describe register files, and therefore the tool doesn't
-know if there is more than one register file, and how many temporaries are
-available for register renaming.
+Since r329067, scheduling models can now optionally specify which register files
+are available on the processor. Class DispatchUnit(see Dispatch.h) would use
+that information to initialize register file descriptors.

-By default, the tool (optimistically) assumes a single register file with an
-unbounded number of temporary registers. Users can limit the number of
-temporary registers available for register renaming using flag
-`-register-file-size=<N>`, where N is the number of temporaries. A value of
-zero for N means 'unbounded'. Knowing how many temporaries are available for
-register renaming, the tool can predict dispatch stalls caused by the lack of
-temporaries.
+By default, if the model doesn't describe register files, the tool
+(optimistically) assumes a single register file with an unbounded number of
+temporary registers. Users can limit the number of temporary registers that are
+globally available for register renaming using flag `-register-file-size=<N>`,
+where N is the number of temporaries. A value of zero for N means 'unbounded'.
+Knowing how many temporaries are available for register renaming, the tool can
+predict dispatch stalls caused by the lack of temporaries.

 The number of reorder buffer entries consumed by an instruction depends on the
 number of micro-opcodes it specifies in the target scheduling model (see field
@@ -667,25 +665,6 @@ instructions are not evaluated, and therefore control flow is not affected.
 However, the tool still queries the processor scheduling model to obtain latency
 information for instructions that affect the control flow.

-Possible extensions to the scheduling model
--------------------------------------------
-Section "Instruction Dispatch" explained how the tool doesn't know about the
-register files, and temporaries available in each register file for register
-renaming purposes.
-
-The LLVM scheduling model could be extended to better describe register files.
-Ideally, scheduling model should be able to define:
-- The size of each register file
-- How many temporary registers are available for register renaming
-- How register classes map to register files
-
-The scheduling model doesn't specify the retire throughput (i.e. how many
-instructions can be retired every cycle). Users can specify flag
-`-max-retire-per-cycle=<uint>` to limit how many instructions the retire control
-unit can retire every cycle. Ideally, every processor should be able to specify
-the retire throughput (for example, by adding an extra field to the scheduling
-model tablegen class).
-
 Known limitations on X86 processors
 -----------------------------------

@@ -867,8 +846,6 @@ analysis.
 Future work
 -----------
 * Address limitations (described in section "Known limitations").
-* Integrate extra description in the processor models, and make it opt-in for
-the targets (see section "Possible extensions to the scheduling model").
 * Let processors specify the selection strategy for processor resource groups
 and resources with multiple units. The tool currently uses a round-robin
 selector to pick the next resource to use.
@@ -877,8 +854,11 @@ Future work
 * Address design issues identified in section "Known design problems".
 * Define a standard interface for "Views". This would let users customize the
 performance report generated by the tool.
-* Simplify the Backend interface.

 When interfaces are mature/stable:
 * Move the logic into a library. This will enable a number of other
 interesting use cases.
+
+Work is currently tracked on https://bugs.llvm.org. llvm-mca bugs are tagged
+with prefix [llvm-mca]. You can easily find the full list of open bugs if you
+search for that tag.
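
Note on the hunk at -387,17: the README text says that, once the number of temporaries is known, the tool can predict dispatch stalls caused by the lack of temporaries, and that a size of zero means 'unbounded'. The sketch below is only a toy illustration of that idea, not the logic in DispatchUnit or anything from the llvm-mca sources. The function name `count_rename_stalls`, its parameters, and the retirement rule (the oldest in-flight instruction frees its temporaries each time dispatch stalls) are assumptions made for the example.

```python
from collections import deque


def count_rename_stalls(defs_per_instruction, register_file_size):
    """Count dispatch stalls caused by running out of temporaries.

    Hypothetical toy model, not llvm-mca code:
    - defs_per_instruction: number of register definitions (temporaries
      needed for renaming) for each instruction, in program order.
    - register_file_size: number of temporaries; 0 means 'unbounded',
      mirroring the documented meaning of -register-file-size=0.
    """
    in_flight = deque()  # temporaries held by each not-yet-retired instruction
    used = 0             # temporaries currently allocated
    stalls = 0

    for num_defs in defs_per_instruction:
        if register_file_size and num_defs > register_file_size:
            raise ValueError("instruction needs more temporaries than exist")

        # Stall until enough temporaries are free. Assumption: on each
        # stall the oldest in-flight instruction retires and releases
        # its temporaries.
        while register_file_size and used + num_defs > register_file_size:
            stalls += 1
            used -= in_flight.popleft()

        in_flight.append(num_defs)
        used += num_defs

    return stalls


if __name__ == "__main__":
    # Four single-definition instructions, unbounded vs. two temporaries.
    print(count_rename_stalls([1, 1, 1, 1], 0))
    print(count_rename_stalls([1, 1, 1, 1], 2))
```

With an unbounded register file the model never stalls. With a small one, every instruction dispatched after the file fills up has to wait for an older one to retire, which is the kind of stall the flag is meant to expose.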