forked from OSchip/llvm-project
[docs] Add page on opaque pointer types
Reviewed By: dblaikie, dexonsmith Differential Revision: https://reviews.llvm.org/D102292
parent 71a0609a2b
commit 772bdef6af
@ -0,0 +1,130 @@
===============
Opaque Pointers
===============

The Opaque Pointer Type
=======================

Traditionally, LLVM IR pointer types have contained a pointee type. For example,
``i32*`` is a pointer that points to an ``i32`` somewhere in memory. However,
because pointee types carry no semantics and cause various issues, there is a
desire to remove pointee types from pointers.

The opaque pointer type project aims to replace all pointer types containing
pointee types in LLVM with an opaque pointer type. The new pointer type is
tentatively represented textually as ``ptr``.

Anything to do with pointer address spaces is unaffected.

Issues with explicit pointee types
==================================

LLVM IR pointers can be cast back and forth between pointers with different
pointee types. The pointee type does not necessarily represent the actual
underlying type in memory. In other words, the pointee type carries no real
semantics.

Many operations do not care about the underlying type. These operations,
typically intrinsics, usually end up taking an ``i8*``. This causes many
redundant no-op bitcasts in the IR to and from a pointer with a different
pointee type. The extra bitcasts take up space and require extra work to look
through in optimizations. More bitcasts also increase the chance of incorrect
bitcasts, especially with regard to address spaces.

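As a minimal sketch of the redundant bitcasts described above, consider passing
an ``i32*`` to an operation that takes an ``i8*``. The function ``@consume`` is
hypothetical and only stands in for such an operation.

.. code-block:: llvm

  declare void @consume(i8*)

  define void @typed(i32* %p) {
    ; No-op cast whose only purpose is to satisfy the parameter type.
    %raw = bitcast i32* %p to i8*
    call void @consume(i8* %raw)
    ret void
  }

With opaque pointers, assuming the tentative ``ptr`` spelling, the same call
would need no cast:

.. code-block:: llvm

  declare void @consume(ptr)

  define void @opaque(ptr %p) {
    ; Every pointer already has the type the callee expects.
    call void @consume(ptr %p)
    ret void
  }
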
Some instructions still need to know what type to treat the memory pointed to by
the pointer as. For example, a load needs to know how many bytes to load from
memory. In these cases, instructions themselves contain a type argument. For
example, the load instruction from older versions of LLVM

.. code-block:: llvm

  load i64* %p

becomes

.. code-block:: llvm

  load i64, ptr %p

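The same idea applies to stores, where the accessed type comes from the value
operand. The following is only a sketch, assuming the tentative ``ptr``
spelling; the function name ``@increment`` is made up for illustration.

.. code-block:: llvm

  define void @increment(ptr %p) {
    ; The loaded type is an operand of the load, not a property of %p.
    %old = load i64, ptr %p
    %new = add i64 %old, 1
    ; The stored type is taken from the value being stored.
    store i64 %new, ptr %p
    ret void
  }
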
A nice analogous transition that happened earlier in LLVM is integer signedness.
Initially, LLVM IR distinguished between unsigned and signed integer types.
Today there is no such distinction; rather, the integer operations themselves
specify how to treat the integer. The transition from manifesting signedness in
types to manifesting it in instructions happened early on in LLVM's life, to
the betterment of LLVM IR.

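To illustrate the analogy, the sketch below divides the same two ``i32`` values
once as unsigned and once as signed. The function name ``@div_both_ways`` is
hypothetical.

.. code-block:: llvm

  define i32 @div_both_ways(i32 %a, i32 %b) {
    ; Same operand type; the instruction chooses the interpretation.
    %u = udiv i32 %a, %b
    %s = sdiv i32 %a, %b
    ; Combine both results so neither division is dead.
    %r = xor i32 %u, %s
    ret i32 %r
  }
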
I Still Need Pointee Types!
===========================

The frontend should already know what type each operation operates on based on
the input source code. However, some frontends like Clang may end up relying on
LLVM pointer pointee types to keep track of pointee types. The frontend needs to
keep track of frontend pointee types on its own.

For optimizations around frontend types, pointee types are not useful due to
their lack of semantics. Rather, since LLVM IR works on untyped memory, for a
frontend to tell LLVM about frontend types for the purposes of alias analysis,
extra metadata is added to the IR. For more information, see `TBAA
<LangRef.html#tbaa-metadata>`_.

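As a rough sketch of what such metadata looks like, the load below carries a
``!tbaa`` tag describing an access to a C ``int``. The node names follow the
convention Clang uses, but the exact nodes a frontend emits may differ; the
function name ``@read_int`` is hypothetical and the ``ptr`` spelling is assumed.

.. code-block:: llvm

  define i32 @read_int(ptr %p) {
    ; The !tbaa tag, not the pointer type, describes the frontend type.
    %v = load i32, ptr %p, align 4, !tbaa !0
    ret i32 %v
  }

  !0 = !{!1, !1, i64 0}                  ; access tag: base type, access type, offset
  !1 = !{!"int", !2, i64 0}              ; scalar type "int", child of char
  !2 = !{!"omnipotent char", !3, i64 0}  ; char type, parent of "int"
  !3 = !{!"Simple C/C++ TBAA"}           ; root of the type tree
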
Some specific operations still need to know what type a pointer points to. For
the most part, this is codegen and ABI specific. For example, `byval
<LangRef.html#parameter-attributes>`_ arguments are pointers, but backends need
to know the underlying type of the argument to properly lower it. In cases like
these, the attributes contain a type argument. For example,

.. code-block:: llvm

  call void @f(ptr byval(i32) %p)

signifies that ``%p`` as an argument should be lowered as an ``i32`` passed
indirectly.

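A slightly fuller sketch of the same attribute, this time with a struct passed
by value, is shown below. The type ``%struct.Pair`` and the functions
``@callee`` and ``@caller`` are hypothetical, and the ``ptr`` spelling is
assumed.

.. code-block:: llvm

  %struct.Pair = type { i32, i32 }

  ; The byval type tells the backend how much memory to copy for the argument.
  declare void @callee(ptr byval(%struct.Pair) align 4)

  define void @caller(ptr %p) {
    call void @callee(ptr byval(%struct.Pair) align 4 %p)
    ret void
  }
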
If you have use cases that this sort of fix doesn't cover, please email
llvm-dev.

Transition Plan
===============

LLVM currently has many places that depend on pointee types. Each dependency on
pointee types needs to be resolved in some way or another.

Making everything use opaque pointers in one huge commit is infeasible. This
needs to be done incrementally. The following steps need to be done, in no
particular order:

* Introduce the opaque pointer type

* Various ABI attributes and instructions that need a type can be changed one at
  a time

  * This has already happened for many instructions like loads, stores, GEPs,
    and various attributes like ``byval``

* Fix up existing in-tree users of pointee types to not rely on LLVM pointer
  pointee types

* Allow bitcode auto-upgrade of legacy pointer type to the new opaque pointer
  type (not to be turned on until ready)

* Migrate frontends to not keep track of frontend pointee types via LLVM pointer
  pointee types

* Add option to internally treat all pointer types as opaque pointers and see
  what breaks, starting with LLVM tests, then run Clang over large codebases

* Replace legacy pointer types in LLVM tests with opaque pointer types

Frontend Migration Steps
========================

If you have your own frontend, there are a couple of things to do after opaque
pointer types fully work.

* Don't rely on LLVM pointee types to keep track of frontend pointee types

* Migrate away from LLVM IR instruction builders that rely on pointee types

  * For example, ``IRBuilder::CreateGEP()`` has multiple overloads; make sure to
    use one where the source element type is explicitly passed in, not inferred
    from the pointer operand pointee type

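At the IR level, an explicitly passed source element type shows up as the first
operand of ``getelementptr``. The sketch below assumes the tentative ``ptr``
spelling; ``%struct.Point`` and ``@get_y`` are hypothetical names.

.. code-block:: llvm

  %struct.Point = type { i32, i32 }

  define i32 @get_y(ptr %pt) {
    ; %struct.Point is the source element type; it cannot be inferred from %pt.
    %y.addr = getelementptr inbounds %struct.Point, ptr %pt, i32 0, i32 1
    %y = load i32, ptr %y.addr
    ret i32 %y
  }
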
@ -44,6 +44,7 @@ intermediate LLVM representation.
   MergeFunctions
   MCJITDesignAndImplementation
   ORCv2
   OpaquePointers
   JITLink
   NewPassManager
   NVPTXUsage