[LangRef] Clarify the behavior of memory access instructions when pointers/sizes aren't well-defined

This patch to LangRef clarifies the behavior of load/store/memset/memcpy/memmove when their pointers or sizes are not well defined.

When `-msan-check-access-address` is enabled, MSan detects cases where, e.g., only the lower bits of an address are garbage. This does not directly conflict with this patch, because a C program should not use a pointer with undef bits, and reasonable optimizations do not convert a well-defined pointer into a pointer with undef bits.

This patch also adds a definition of a well-defined value.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D87994
Juneyoung Lee 2020-09-21 07:12:12 +09:00
parent 51cad041e0
commit 8bd205bf1d
1 changed files with 38 additions and 4 deletions


@ -3557,10 +3557,6 @@ uses with" concept would not hold.
To ensure all uses of a given register observe the same value (even if
'``undef``'), the :ref:`freeze instruction <i_freeze>` can be used.
A value is frozen if its uses see the same value.
An aggregate value or vector is frozen if its elements are frozen.
The padding of an aggregate isn't considered, since it isn't visible
without storing it into memory and loading it with a different type.
.. code-block:: llvm
@ -3733,6 +3729,23 @@ Here are some examples:
end:
.. _welldefinedvalues:
Well-Defined Values
-------------------
Given a program execution, a value is *well defined* if the value does not
have an undef bit and is not poison in the execution.
An aggregate value or vector is well defined if its elements are well defined.
The padding of an aggregate isn't considered, since it isn't visible
without storing it into memory and loading it with a different type.
A constant of a :ref:`single value <t_single_value>`, non-vector type is well
defined if it is a non-undef constant. Note that there is no poison constant
in LLVM.
The result of a :ref:`freeze instruction <i_freeze>` is well defined regardless
of its operand.
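As a rough illustration of this definition (a sketch only; the function name
``@well_defined_sketch`` is hypothetical), the following contrasts a value
that has undef bits with a frozen one:

.. code-block:: llvm

    define i32 @well_defined_sketch() {
      %a = xor i32 undef, undef   ; not well defined: each undef may take a different value
      %f = freeze i32 undef       ; well defined: one arbitrary but fixed i32
      %z = xor i32 %f, %f         ; always 0, since both uses of %f see the same value
      ret i32 %z
    }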
.. _blockaddress:
Addresses of Basic Blocks
@ -9248,6 +9261,12 @@ If the value being loaded is of aggregate type, the bytes that correspond to
padding may be accessed but are ignored, because it is impossible to observe
padding from the loaded aggregate value.
If the pointer is not a well-defined value, all of its possible representations
must be dereferenceable; otherwise the behavior is undefined. For example,
loading a byte from a pointer to an array of type ``[16 x i8]`` with offset
``undef & 31`` is undefined behavior, because offsets 16 to 31 fall outside the
array. Loading a byte at offset ``undef & 15`` nondeterministically reads one of
the 16 bytes.
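The two cases can be sketched in IR as follows. This is a minimal sketch that
assumes ``%arr`` points to a dereferenceable ``[16 x i8]`` object; the function
names are hypothetical, and the typed-pointer syntax matches LLVM at the time
of this patch (newer releases use the opaque ``ptr`` type).

.. code-block:: llvm

    define i8 @load_in_bounds([16 x i8]* %arr) {
      %idx = and i64 undef, 15    ; any value in 0..15
      %p = getelementptr [16 x i8], [16 x i8]* %arr, i64 0, i64 %idx
      %v = load i8, i8* %p        ; defined: reads one of the 16 bytes nondeterministically
      ret i8 %v
    }

    define i8 @load_maybe_out_of_bounds([16 x i8]* %arr) {
      %idx = and i64 undef, 31    ; any value in 0..31
      %p = getelementptr [16 x i8], [16 x i8]* %arr, i64 0, i64 %idx
      %v = load i8, i8* %p        ; undefined behavior: offsets 16..31 are not dereferenceable
      ret i8 %v
    }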
Examples:
"""""""""
@ -9339,6 +9358,12 @@ belong to the type, but they will typically be overwritten.
If ``<value>`` is of aggregate type, padding is filled with
:ref:`undef <undefvalues>`.
If ``<pointer>`` is not a well-defined value, all of its possible
representations should be dereferenceable. For example, storing a byte to a
pointer to an array of type ``[16 x i8]`` with offset ``undef & 31`` is
undefined behavior. Storing a byte at offset ``undef & 15``
nondeterministically stores to one of the offsets from 0 to 15.
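A minimal sketch of the defined case for stores, assuming ``%arr`` points to a
writable, dereferenceable ``[16 x i8]`` object (the function name is
hypothetical, and the typed-pointer syntax is the one in use when this patch
landed):

.. code-block:: llvm

    define void @store_in_bounds([16 x i8]* %arr) {
      %idx = and i64 undef, 15    ; any value in 0..15
      %p = getelementptr [16 x i8], [16 x i8]* %arr, i64 0, i64 %idx
      store i8 0, i8* %p          ; defined: exactly one of the 16 bytes is overwritten
      ret void
    }

Replacing the mask ``15`` with ``31`` would make the store undefined behavior,
since some possible values of ``%idx`` would then address bytes past the end of
the array.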
Example:
""""""""
@ -12491,6 +12516,9 @@ argument.
If "len" is 0, the pointers may be NULL, dangling, ``undef``, or ``poison``
pointers. However, they must still be appropriately aligned.
If "len" isn't a well-defined value, all of its possible representations should
make the behavior of this ``llvm.memcpy`` defined; otherwise the behavior is
undefined.
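For illustration, a sketch of a call whose length is not well defined. It
assumes ``%dst`` and ``%src`` point to non-overlapping buffers of at least 16
bytes each; the function name ``@copy_sketch`` is hypothetical, and the
intrinsic name uses the typed-pointer mangling from the time of this patch.

.. code-block:: llvm

    declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64, i1)

    define void @copy_sketch(i8* %dst, i8* %src) {
      %len = and i64 undef, 15    ; any value in 0..15
      ; Defined: every possible length stays within both 16-byte buffers.
      call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 %len, i1 false)
      ret void
    }

With a mask of ``31`` instead of ``15``, some possible lengths would run past
the 16-byte buffers, so the call would have undefined behavior.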
.. _int_memcpy_inline:
@ -12608,6 +12636,9 @@ the argument.
If "len" is 0, the pointers may be NULL, dangling, ``undef``, or ``poison``
pointers. However, they must still be appropriately aligned.
If "len" isn't a well-defined value, all of its possible representations should
make the behavior of this ``llvm.memmove`` defined; otherwise the behavior is
undefined.
.. _int_memset:
@ -12663,6 +12694,9 @@ the argument.
If "len" is 0, the pointer may be a NULL, dangling, ``undef``, or ``poison``
pointer. However, it must still be appropriately aligned.
If "len" isn't a well-defined value, all of its possible representations should
make the behavior of this ``llvm.memset`` defined; otherwise the behavior is
undefined.
'``llvm.sqrt.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^