[SYCL][Doc] Add design document for SYCL mode
Initial version of the document covers address space handling Differential Revision: https://reviews.llvm.org/D99488
parent a0677ff5eb
commit b52e69c426
@ -0,0 +1,105 @@
=============================================
SYCL Compiler and Runtime architecture design
=============================================

.. contents::
   :local:

Introduction
============

This document describes the architecture of the SYCL compiler and runtime
library. More details are provided in an
`external document <https://github.com/intel/llvm/blob/sycl/sycl/doc/CompilerAndRuntimeDesign.md>`_\ ,
which will be added to the Clang documentation in the future.

Address space handling
======================

The SYCL specification represents pointers to disjoint memory regions on an
accelerator using C++ wrapper classes, which enables compilation with both a
standard C++ toolchain and a SYCL compiler toolchain. Section 3.8.2 of the SYCL
2020 specification defines the
`memory model <https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#_sycl_device_memory_model>`_\ ,
section 4.7.7 describes the `address space classes <https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#_address_space_classes>`_
and section 5.9 covers `address space deduction <https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#_address_space_deduction>`_.
The SYCL specification allows two modes of address space deduction: "generic as
default address space" (see section 5.9.3) and "inferred address space" (see
section 5.9.4). The current implementation supports only the "generic as default
address space" mode.

SYCL borrows its memory model from OpenCL; however, SYCL does not perform
the address space qualifier inference detailed in
`OpenCL C v3.0 6.7.8 <https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/OpenCL_C.html#addr-spaces-inference>`_.

The default address space is "generic-memory", which is a virtual address space
that overlaps the global, local, and private address spaces. SYCL mode enables
explicit conversions in both directions between the default address space and
an address space-attributed type, and implicit conversions from an address
space-attributed type to the default address space. All named address spaces
are disjoint subsets of the default address space.

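The conversion rules above can be sketched as follows. This is a minimal
illustration rather than code from the implementation: the macro
``SKETCH_GLOBAL`` and the function ``convert_both_ways`` are hypothetical names,
and the explicit conversion is written as a C-style cast on the assumption that
this spelling is accepted in SYCL mode. Outside SYCL device compilation the
macro expands to nothing, so the same source builds as plain C++.

```cpp
#include <cassert>

// Hypothetical helper macro: applies the attribute only in SYCL device mode.
#if defined(__SYCL_DEVICE_ONLY__)
#define SKETCH_GLOBAL __attribute__((opencl_global))
#else
#define SKETCH_GLOBAL
#endif

// Hypothetical function; gp points into a named (global) address space.
inline int convert_both_ways(SKETCH_GLOBAL int *gp) {
  assert(gp != nullptr);
  // Implicit conversion: address space-attributed type -> default address space.
  int *dp = gp;
  // Explicit conversion: default address space -> address space-attributed type.
  SKETCH_GLOBAL int *back = (SKETCH_GLOBAL int *)dp;
  // Both pointers refer to the same object, so the value is read twice.
  return *dp + *back;
}
```

The reverse direction without a cast (assigning ``dp`` to a ``SKETCH_GLOBAL``
pointer directly) is the case the text describes as requiring an explicit
conversion.
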
The SPIR target allocates SYCL namespace-scope variables in the global address
space.

Pointers to the default address space should be lowered to pointers to a generic
address space (or flat, to reuse more general terminology). Depending on the
allocation context, however, the default address space of a non-pointer type is
assigned to a specific address space. This is described in the
`common address space deduction rules <https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#subsec:commonAddressSpace>`_
section.

This is also in line with the behaviour of CUDA (`small example
<https://godbolt.org/z/veqTfo9PK>`_).

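The difference between pointer and non-pointer types can be illustrated with
plain C++. The sketch below is only an annotated example: the names
``g_counter``, ``load`` and ``sum_from_two_contexts`` are hypothetical, and the
comment about the automatic variable assumes the common deduction rules assign
function-local objects to the private address space.

```cpp
#include <cassert>

// Namespace-scope variable: per the text above, the SPIR target allocates it
// in the global address space.
int g_counter = 40;

// Unattributed pointer parameter: it points to the default address space and
// is expected to be lowered to a generic (flat) pointer.
inline int load(const int *p) {
  assert(p != nullptr);
  return *p;
}

inline int sum_from_two_contexts() {
  // Automatic variable: a non-pointer object, so its address space follows the
  // allocation context (assumed to be private for a function-local variable).
  int local_value = 2;
  // Objects from two allocation contexts go through the same pointer type.
  return load(&g_counter) + load(&local_value);
}
```

No address space attribute appears in the source; the placement of each object
is decided by where it is declared, while the pointer type stays generic.
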
``multi_ptr`` class implementation example:

.. code-block:: C++

    // check that SYCL mode is ON and we can use non-standard decorations
    #if defined(__SYCL_DEVICE_ONLY__)
    // GPU/accelerator implementation
    template <typename T, address_space AS> class multi_ptr {
      // DecoratedType applies corresponding address space attribute to the type T
      // DecoratedType<T, global_space>::type == "__attribute__((opencl_global)) T"
      // See sycl/include/CL/sycl/access/access.hpp for more details
      using pointer_t = typename DecoratedType<T, AS>::type *;

      pointer_t m_Pointer;

    public:
      pointer_t get() { return m_Pointer; }
      T& operator* () { return *reinterpret_cast<T*>(m_Pointer); }
    };
    #else
    // CPU/host implementation
    template <typename T, address_space AS> class multi_ptr {
      T *m_Pointer; // regular undecorated pointer

    public:
      T *get() { return m_Pointer; }
      T& operator* () { return *m_Pointer; }
    };
    #endif

Depending on the compiler mode, ``multi_ptr`` will either decorate its internal
data with the address space attribute or not.

To utilize clang's existing functionality, we reuse the following OpenCL address
space attributes for pointers:

.. list-table::
   :header-rows: 1

   * - Address space attribute
     - SYCL address_space enumeration
   * - ``__attribute__((opencl_global))``
     - global_space, constant_space
   * - ``__attribute__((opencl_local))``
     - local_space
   * - ``__attribute__((opencl_private))``
     - private_space

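The mapping in the table can be expressed as a small type trait. The following
is a sketch under stated assumptions, not the contents of ``access.hpp``: the
``address_space`` enumeration is a local stand-in, the ``SKETCH_*`` macros are
hypothetical, and the trait only mirrors the ``DecoratedType`` name used in the
``multi_ptr`` example. On the host the macros are empty, so every mapping
collapses to the undecorated type.

```cpp
#include <type_traits>

// Stand-in for the SYCL address_space enumeration (names taken from the table).
enum class address_space { global_space, constant_space, local_space, private_space };

// Hypothetical macros: real attributes in SYCL device mode, empty on the host.
#if defined(__SYCL_DEVICE_ONLY__)
#define SKETCH_GLOBAL __attribute__((opencl_global))
#define SKETCH_LOCAL __attribute__((opencl_local))
#define SKETCH_PRIVATE __attribute__((opencl_private))
#else
#define SKETCH_GLOBAL
#define SKETCH_LOCAL
#define SKETCH_PRIVATE
#endif

// Primary template is left undefined; each address space gets a specialization.
template <typename T, address_space AS> struct DecoratedType;

template <typename T> struct DecoratedType<T, address_space::global_space> {
  using type = SKETCH_GLOBAL T;
};
// constant_space reuses opencl_global, as in the table.
template <typename T> struct DecoratedType<T, address_space::constant_space> {
  using type = SKETCH_GLOBAL T;
};
template <typename T> struct DecoratedType<T, address_space::local_space> {
  using type = SKETCH_LOCAL T;
};
template <typename T> struct DecoratedType<T, address_space::private_space> {
  using type = SKETCH_PRIVATE T;
};

#if !defined(__SYCL_DEVICE_ONLY__)
// Host build: no attribute is applied, so the decorated type is plain T.
static_assert(
    std::is_same<DecoratedType<int, address_space::local_space>::type, int>::value,
    "host types are undecorated");
#endif
```

Keeping the attribute behind a trait means the ``multi_ptr`` wrapper needs only
one spelling of its pointer type for both compilation modes.
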
.. code-block::

    TODO: add support for `__attribute__((opencl_global_host))` and
    `__attribute__((opencl_global_device))`.