[documentation][llvm-mca] Update the documentation.
Scheduling models can now describe processor register files and retire control units. This updates the existing documentation and the README file.

llvm-svn: 329311
parent 6ecdb03f16
commit efc3f39f02
@@ -65,15 +65,14 @@ option specifies "``-``", then the output will also be sent to standard output.
 .. option:: -dispatch=<width>
 
 Specify a different dispatch width for the processor. The dispatch width
-defaults to the 'IssueWidth' specified by the processor scheduling model.
-If width is zero, then the default dispatch width is used.
+defaults to field 'IssueWidth' in the processor scheduling model. If width is
+zero, then the default dispatch width is used.
 
 .. option:: -register-file-size=<size>
 
-Specify the size of the register file. When specified, this flag limits
-how many temporary registers are available for register renaming purposes. By
-default, the number of temporary registers is unlimited. A value of zero for
-this flag means "unlimited number of temporary registers".
+Specify the size of the register file. When specified, this flag limits how
+many temporary registers are available for register renaming purposes. A value
+of zero for this flag means "unlimited number of temporary registers".
 
 .. option:: -iterations=<number of iterations>
 
@@ -34,9 +34,7 @@ the purpose of scheduling instructions (and therefore not described by the
 scheduling model), but are very important for this tool.
 
 A few examples of details that are missing in scheduling models are:
-- Maximum number of instructions retired per cycle.
 - Actual dispatch width (it often differs from the issue width).
-- Number of temporary registers available for renaming.
 - Number of read/write ports in the register file(s).
 - Length of the load/store queue in the LSUnit.
 
@@ -387,17 +385,17 @@ An instruction can be dispatched if:
 - There are enough temporary registers to do register renaming
 - Schedulers are not full.
 
-Scheduling models don't describe register files, and therefore the tool doesn't
-know if there is more than one register file, and how many temporaries are
-available for register renaming.
+Since r329067, scheduling models can now optionally specify which register files
+are available on the processor. Class DispatchUnit(see Dispatch.h) would use
+that information to initialize register file descriptors.
 
-By default, the tool (optimistically) assumes a single register file with an
-unbounded number of temporary registers. Users can limit the number of
-temporary registers available for register renaming using flag
-`-register-file-size=<N>`, where N is the number of temporaries. A value of
-zero for N means 'unbounded'. Knowing how many temporaries are available for
-register renaming, the tool can predict dispatch stalls caused by the lack of
-temporaries.
+By default, if the model doesn't describe register files, the tool
+(optimistically) assumes a single register file with an unbounded number of
+temporary registers. Users can limit the number of temporary registers that are
+globally available for register renaming using flag `-register-file-size=<N>`,
+where N is the number of temporaries. A value of zero for N means 'unbounded'.
+Knowing how many temporaries are available for register renaming, the tool can
+predict dispatch stalls caused by the lack of temporaries.
 
 The number of reorder buffer entries consumed by an instruction depends on the
 number of micro-opcodes it specifies in the target scheduling model (see field
@@ -667,25 +665,6 @@ instructions are not evaluated, and therefore control flow is not affected.
 However, the tool still queries the processor scheduling model to obtain latency
 information for instructions that affect the control flow.
 
-Possible extensions to the scheduling model
--------------------------------------------
-Section "Instruction Dispatch" explained how the tool doesn't know about the
-register files, and temporaries available in each register file for register
-renaming purposes.
-
-The LLVM scheduling model could be extended to better describe register files.
-Ideally, scheduling model should be able to define:
-- The size of each register file
-- How many temporary registers are available for register renaming
-- How register classes map to register files
-
-The scheduling model doesn't specify the retire throughput (i.e. how many
-instructions can be retired every cycle). Users can specify flag
-`-max-retire-per-cycle=<uint>` to limit how many instructions the retire control
-unit can retire every cycle. Ideally, every processor should be able to specify
-the retire throughput (for example, by adding an extra field to the scheduling
-model tablegen class).
-
 Known limitations on X86 processors
 -----------------------------------
 
@@ -867,8 +846,6 @@ analysis.
 Future work
 -----------
 * Address limitations (described in section "Known limitations").
-* Integrate extra description in the processor models, and make it opt-in for
-the targets (see section "Possible extensions to the scheduling model").
 * Let processors specify the selection strategy for processor resource groups
 and resources with multiple units. The tool currently uses a round-robin
 selector to pick the next resource to use.
@@ -877,8 +854,11 @@ Future work
 * Address design issues identified in section "Known design problems".
 * Define a standard interface for "Views". This would let users customize the
 performance report generated by the tool.
-* Simplify the Backend interface.
 
 When interfaces are mature/stable:
 * Move the logic into a library. This will enable a number of other
 interesting use cases.
+
+Work is currently tracked on https://bugs.llvm.org. llvm-mca bugs are tagged
+with prefix [llvm-mca]. You can easily find the full list of open bugs if you
+search for that tag.