forked from OSchip/llvm-project
f5b21680d1
This builtin allows the creation of custom scheduling pipelines on a per-region basis. Like the sched_barrier builtin this is intended to be used either for testing, in situations where the default scheduler heuristics cannot be improved, or in critical kernels where users are trying to get performance that is close to handwritten assembly. Obviously using these builtins will require extra work from the kernel writer to maintain the desired behavior. The builtin can be used to create groups of instructions called "scheduling groups" where ordering between the groups is enforced by the scheduler. __builtin_amdgcn_sched_group_barrier takes three parameters. The first parameter is a mask that determines the types of instructions that you would like to synchronize around and add to a scheduling group. These instructions will be selected from the bottom up starting from the sched_group_barrier's location during instruction scheduling. The second parameter is the number of matching instructions that will be associated with this sched_group_barrier. The third parameter is an identifier which is used to describe what other sched_group_barriers should be synchronized with. Note that multiple sched_group_barriers must be added in order for them to be useful since they only synchronize with other sched_group_barriers. Only "scheduling groups" with a matching third parameter will have any enforced ordering between them. As an example, the code below tries to create a pipeline of 1 VMEM_READ instruction followed by 1 VALU instruction followed by 5 MFMA instructions... // 1 VMEM_READ __builtin_amdgcn_sched_group_barrier(32, 1, 0) // 1 VALU __builtin_amdgcn_sched_group_barrier(2, 1, 0) // 5 MFMA __builtin_amdgcn_sched_group_barrier(8, 5, 0) // 1 VMEM_READ __builtin_amdgcn_sched_group_barrier(32, 1, 0) // 3 VALU __builtin_amdgcn_sched_group_barrier(2, 3, 0) // 2 VMEM_WRITE __builtin_amdgcn_sched_group_barrier(64, 2, 0) Reviewed By: jrbyrnes Differential Revision: https://reviews.llvm.org/D128158 |
||
---|---|---|
.. | ||
bindings | ||
cmake | ||
docs | ||
examples | ||
include | ||
lib | ||
runtime | ||
test | ||
tools | ||
unittests | ||
utils | ||
www | ||
.clang-format | ||
.clang-tidy | ||
.gitignore | ||
CMakeLists.txt | ||
CODE_OWNERS.TXT | ||
INSTALL.txt | ||
LICENSE.TXT | ||
ModuleInfo.txt | ||
NOTES.txt | ||
README.txt |
README.txt
//===----------------------------------------------------------------------===// // C Language Family Front-end //===----------------------------------------------------------------------===// Welcome to Clang. This is a compiler front-end for the C family of languages (C, C++, Objective-C, and Objective-C++) which is built as part of the LLVM compiler infrastructure project. Unlike many other compiler frontends, Clang is useful for a number of things beyond just compiling code: we intend for Clang to be host to a number of different source-level tools. One example of this is the Clang Static Analyzer. If you're interested in more (including how to build Clang) it is best to read the relevant web sites. Here are some pointers: Information on Clang: http://clang.llvm.org/ Building and using Clang: http://clang.llvm.org/get_started.html Clang Static Analyzer: http://clang-analyzer.llvm.org/ Information on the LLVM project: http://llvm.org/ If you have questions or comments about Clang, a great place to discuss them is on the Clang forums: https://discourse.llvm.org/c/clang/ If you find a bug in Clang, please file it in the LLVM bug tracker: http://llvm.org/bugs/