[Docs] Update opaque pointer docs (NFC)
Mention -opaque-pointers, write a bit more about migration pitfalls and update the open issues.
parent
a1da73961d
commit
3bbf7f5ed8
|
@ -1199,7 +1199,7 @@ Currently, only the following parameter attributes are defined:
|
|||
|
||||
The ``elementtype`` argument attribute can be used to specify a pointer
|
||||
element type in a way that is compatible with `opaque pointers
|
||||
<OpaquePointers.html>`.
|
||||
<OpaquePointers.html>`_.
|
||||
|
||||
The ``elementtype`` attribute by itself does not carry any specific
|
||||
semantics. However, certain intrinsics may require this attribute to be
|
||||
|
|
|
@ -6,7 +6,7 @@ The Opaque Pointer Type
|
|||
=======================
|
||||
|
||||
Traditionally, LLVM IR pointer types have contained a pointee type. For example,
|
||||
``i32 *`` is a pointer that points to an ``i32`` somewhere in memory. However,
|
||||
``i32*`` is a pointer that points to an ``i32`` somewhere in memory. However,
|
||||
due to a lack of pointee type semantics and various issues with having pointee
|
||||
types, there is a desire to remove pointee types from pointers.
|
||||
|
||||
|
@ -29,7 +29,7 @@ actual underlying type in memory. In other words, the pointee type contains no
|
|||
real semantics.
|
||||
|
||||
Lots of operations do not actually care about the underlying type. These
|
||||
operations, typically intrinsics, usually end up taking an ``i8 *``. This causes
|
||||
operations, typically intrinsics, usually end up taking an ``i8*``. This causes
|
||||
lots of redundant no-op bitcasts in the IR to and from a pointer with a
|
||||
different pointee type. The extra bitcasts take up space and require extra work
|
||||
to look through in optimizations. More bitcasts also increase the chances of
|
||||
|
@ -57,6 +57,35 @@ LLVM IR distinguished between unsigned and signed integer types. The transition
|
|||
from manifesting signedness in types to instructions happened early on in LLVM's
|
||||
life to the betterment of LLVM IR.
|
||||
|
||||
Opaque Pointers Mode
|
||||
====================
|
||||
|
||||
During the transition phase, LLVM can be used in two modes: In typed pointer
|
||||
mode (currently still the default) all pointer types have a pointee type and
|
||||
opaque pointers cannot be used. In opaque pointers mode, all pointers are
|
||||
opaque. The opaque pointer mode can be enabled using ``-opaque-pointers`` in
|
||||
LLVM tools like ``opt``, or ``-mllvm -opaque-pointers`` in clang.
|
||||
|
||||
In opaque pointer mode, all typed pointers used in IR, bitcode, or created
|
||||
using ``PointerType::get()`` and similar APIs are automatically converted into
|
||||
opaque pointers. This simplifies migration and allows testing existing IR with
|
||||
opaque pointers.
|
||||
|
||||
.. code-block:: llvm
|
||||
|
||||
define i8* @test(i8* %p) {
|
||||
%p2 = getelementptr i8, i8* %p, i64 1
|
||||
ret i8* %p2
|
||||
}
|
||||
|
||||
; Is automatically converted into the following if -opaque-pointers
|
||||
; is enabled:
|
||||
|
||||
define ptr @test(ptr %p) {
|
||||
%p2 = getelementptr i8, ptr %p, i64 1
|
||||
ret ptr %p2
|
||||
}
|
||||
|
||||
I Still Need Pointee Types!
|
||||
===========================
|
||||
|
||||
|
@ -87,73 +116,92 @@ indirectly.
|
|||
If you have use cases that this sort of fix doesn't cover, please email
|
||||
llvm-dev.
|
||||
|
||||
Transition Plan
|
||||
===============
|
||||
Migration Instructions
|
||||
======================
|
||||
|
||||
LLVM currently has many places that depend on pointee types. Each dependency on
|
||||
pointee types needs to be resolved in some way or another. This essentially
|
||||
translates to figuring out how to remove all calls to
|
||||
``PointerType::getElementType`` and ``Type::getPointerElementType()``.
|
||||
In order to support opaque pointers, two types of changes tend to be necessary.
|
||||
The first is the removal of all calls to ``PointerType::getElementType()`` and
|
||||
``Type::getPointerElementType()``.
|
||||
|
||||
Making everything use opaque pointers in one huge commit is infeasible. This
|
||||
needs to be done incrementally. The following steps need to be done, in no
|
||||
particular order:
|
||||
In the LLVM middle-end and backend, this is usually accomplished by inspecting
|
||||
the type of relevant operations instead. For example, memory access related
|
||||
analyses and optimizations should use the types encoded in the load and store
|
||||
instructions instead of querying the pointer type.
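
As a minimal sketch of why this works (the function name ``@access_types`` is
hypothetical and used only for illustration), every type an analysis needs is
already stated on the memory instructions, so nothing has to be read off the
type of ``%p``:

.. code-block:: llvm

    define i64 @access_types(ptr %p) {
      ; The accessed type (i32) is part of the load, not of %p.
      %a = load i32, ptr %p
      ; The GEP names its source element type (i64) explicitly.
      %q = getelementptr i64, ptr %p, i64 1
      ; A different accessed type (i64) through a pointer of the same type.
      %b = load i64, ptr %q
      %a.ext = zext i32 %a to i64
      %r = add i64 %a.ext, %b
      ret i64 %r
    }

Both loads take a plain ``ptr`` operand, and only the instruction tells them
apart.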
|
||||
|
||||
* Introduce the opaque pointer type
|
||||
Frontends need to be adjusted to track pointee types independently of LLVM,
|
||||
insofar as they are necessary for lowering. For example, clang now tracks the
|
||||
pointee type in the ``Address`` structure.
|
||||
|
||||
* Already done
|
||||
While direct usage of pointer element types is immediately apparent in code,
|
||||
there is a more subtle issue that opaque pointers need to contend with: A lot
|
||||
of code assumes that pointer equality also implies that the used load/store
|
||||
type is the same. Consider the following examples with typed and opaque pointers:
|
||||
|
||||
* Remove remaining in-tree users of pointee types
|
||||
.. code-block:: llvm
|
||||
|
||||
* There are many miscellaneous uses that should be cleaned up individually
|
||||
define i64 @test(i32* %p) {
|
||||
store i32 0, i32* %p
|
||||
%bc = bitcast i32* %p to i64*
|
||||
%v = load i64, i64* %bc
|
||||
ret i64 %v
|
||||
}
|
||||
|
||||
* Some of the larger use cases are mentioned below
|
||||
define i64 @test(ptr %p) {
|
||||
store i32 0, ptr %p
|
||||
%v = load i64, ptr %p
|
||||
ret i64 %v
|
||||
}
|
||||
|
||||
* Various ABI attributes and instructions that rely on pointee types need to be
|
||||
modified to specify the type separately
|
||||
Without opaque pointers, a check that the pointer operand of the load and
|
||||
store are the same also ensures that the accessed type is the same. Using a
|
||||
different type requires a bitcast, which will result in distinct pointer
|
||||
operands.
|
||||
|
||||
* This has already happened for all instructions like loads, stores, GEPs,
|
||||
and various attributes like ``byval``
|
||||
With opaque pointers, the bitcast is not present, and this check is no longer
|
||||
sufficient. In the above example, it could result in store to load forwarding
|
||||
of an incorrect type. Code making such assumptions needs to be adjusted to
|
||||
check the accessed type explicitly:
|
||||
``LI->getType() == SI->getValueOperand()->getType()``.
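
For contrast, here is a minimal sketch of a case where the explicit type check
succeeds and forwarding remains valid (the function name ``@same_type`` is
hypothetical):

.. code-block:: llvm

    define i32 @same_type(ptr %p) {
      store i32 0, ptr %p
      ; Same pointer operand and same accessed type (i32) as the store,
      ; so the stored value may be forwarded to this load.
      %v = load i32, ptr %p
      ret i32 %v
    }

Here the pointer operands match and the loaded type equals the stored value
type, so both conditions hold.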
|
||||
|
||||
* More cases may be found as work continues
|
||||
Frontends using the C API through an FFI interface should be aware that a
|
||||
number of C API functions are deprecated and will be removed as part of the
|
||||
opaque pointer transition::
|
||||
|
||||
* Remove calls to and deprecate ``IRBuilder`` methods that rely on pointee types
|
||||
LLVMBuildLoad -> LLVMBuildLoad2
|
||||
LLVMBuildCall -> LLVMBuildCall2
|
||||
LLVMBuildInvoke -> LLVMBuildInvoke2
|
||||
LLVMBuildGEP -> LLVMBuildGEP2
|
||||
LLVMBuildInBoundsGEP -> LLVMBuildInBoundsGEP2
|
||||
LLVMBuildStructGEP -> LLVMBuildStructGEP2
|
||||
LLVMConstGEP -> LLVMConstGEP2
|
||||
LLVMConstInBoundsGEP -> LLVMConstInBoundsGEP2
|
||||
LLVMAddAlias -> LLVMAddAlias2
|
||||
|
||||
* For example, some of the ``IRBuilder::CreateGEP()`` methods use the pointer
|
||||
operand's pointee type to determine the GEP operand type
|
||||
Additionally, it will no longer be possible to call ``LLVMGetElementType()``
|
||||
on a pointer type.
|
||||
|
||||
* Some methods are already deprecated with ``LLVM_ATTRIBUTE_DEPRECATED``, such
|
||||
as some overloads of ``IRBuilder::CreateLoad()``
|
||||
Transition State
|
||||
================
|
||||
|
||||
* Allow bitcode auto-upgrade of legacy pointer type to the new opaque pointer
|
||||
type (not to be turned on until ready)
|
||||
As of January 2022, large parts of LLVM support opaque pointers, but there are
|
||||
still some major open problems:
|
||||
|
||||
* To support legacy bitcode, such as legacy stores/loads, we need to track
|
||||
pointee types for all values since legacy instructions may infer the types
|
||||
from a pointer operand's pointee type
|
||||
* Bitcode already fully supports opaque pointers, and reading up-to-date
|
||||
typed pointer bitcode in opaque pointers mode also works. However, we
|
||||
currently do not support pointee type based auto-upgrade of old bitcode in
|
||||
opaque pointer mode.
|
||||
|
||||
* Migrate frontends to not keep track of frontend pointee types via LLVM pointer
|
||||
pointee types
|
||||
* While clang has limited support for opaque pointers (sufficient to compile
|
||||
CTMark on Linux), a major effort will be needed to systematically remove all
|
||||
uses of ``getPointerElementType()`` and the deprecated ``Address()``
|
||||
constructor.
|
||||
|
||||
* This is mostly Clang, see ``clang::CodeGen::Address::getElementType()``
|
||||
* We do not yet have a testing strategy for how we can test both typed and
|
||||
opaque pointers during the migration. Currently, individual tests for
|
||||
opaque pointers are being added, but the bulk of tests still uses typed
|
||||
pointers.
|
||||
|
||||
* Add option to internally treat all pointer types opaque pointers and see what
|
||||
breaks, starting with LLVM tests, then run Clang over large codebases
|
||||
* Loop access analysis does not support opaque pointers yet, and is currently
|
||||
the main source of assertion failures in optimized builds.
|
||||
|
||||
* We don't want to start mass-updating tests until we're fairly confident that opaque pointers won't cause major issues
|
||||
|
||||
* Replace legacy pointer types in LLVM tests with opaque pointer types
|
||||
|
||||
Frontend Migration Steps
|
||||
========================
|
||||
|
||||
If you have your own frontend, there are a couple of things to do after opaque
|
||||
pointer types fully work.
|
||||
|
||||
* Don't rely on LLVM pointee types to keep track of frontend pointee types
|
||||
|
||||
* Migrate away from LLVM IR instruction builders that rely on pointee types
|
||||
|
||||
* For example, ``IRBuilder::CreateGEP()`` has multiple overloads; make sure to
|
||||
use one where the source element type is explicitly passed in, not inferred
|
||||
from the pointer operand pointee type
|
||||
* Miscellaneous uses of pointer element types remain everywhere.
|
||||
|
|