[LangRef] Clarify poison semantics

I find the current documentation of poison somewhat confusing,
mainly because its use of "undefined behavior" doesn't seem to
align with our usual interpretation (of immediate UB). In particular,
the sentence "any instruction that has a dependence on a poison
value has undefined behavior" is misleading, since most such
instructions simply return poison.

Clarify poison semantics by:

 * Replacing the introductory paragraph with the standard rationale
   for having poison values.
 * Spelling out that instructions depending on poison return poison.
 * Spelling out how we go from a poison value to immediate undefined
   behavior, and giving the two examples we currently use in ValueTracking.
 * Spelling out that side effects depending on poison are UB.
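
As an illustration of the last three points, here is a minimal toy model
in C++. It is a sketch under stated assumptions, not LLVM code: the names
Value, subNuw, bitAnd, udiv and ImmediateUB are hypothetical, std::nullopt
stands in for a poison value, and a thrown exception stands in for
immediate undefined behavior.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <stdexcept>

// Toy model only: an empty optional represents poison.
// None of these names exist in LLVM.
using Value = std::optional<std::uint32_t>;

// Stand-in for "immediate undefined behavior" in this model.
struct ImmediateUB : std::runtime_error {
  explicit ImmediateUB(const char *Msg) : std::runtime_error(Msg) {}
};

// Models `sub nuw`: unsigned wrap produces poison, not immediate UB.
Value subNuw(Value A, Value B) {
  if (!A || !B)
    return std::nullopt; // Depends on poison, so the result is poison.
  if (*B > *A)
    return std::nullopt; // Would wrap, which `nuw` forbids: poison.
  return *A - *B;
}

// Models `and`: poison propagates even when the other operand is 0.
Value bitAnd(Value A, Value B) {
  if (!A || !B)
    return std::nullopt;
  return *A & *B;
}

// Models `udiv`: a poison divisor is immediate UB, because poison may be
// relaxed to undef, which may take the value 0.
Value udiv(Value A, Value B) {
  if (!B)
    throw ImmediateUB("poison divisor");
  if (*B == 0)
    throw ImmediateUB("division by zero");
  if (!A)
    return std::nullopt; // A poison dividend only yields poison.
  return *A / *B;
}
```

In this model, bitAnd(subNuw(0, 1), 0) is still poison, matching the
`and i32 %poison, 0` example in LangRef, while udiv(1, subNuw(0, 1))
throws ImmediateUB.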

Differential Revision: https://reviews.llvm.org/D63044

llvm-svn: 363320
Nikita Popov 2019-06-13 19:45:36 +00:00
parent 24f4085811
commit ad81d427ca
2 changed files with 25 additions and 10 deletions

@@ -2153,6 +2153,8 @@ to the following rules:
address range of the allocated storage.
- A null pointer in the default address-space is associated with no
address.
+- An :ref:`undef value <undefvalues>` in *any* address-space is
+associated with no address.
- An integer constant other than zero or a pointer value returned from
a function not defined within LLVM may be associated with address
ranges allocated through mechanisms other than those provided by
@@ -3205,10 +3207,9 @@ behavior.
Poison Values
-------------
-Poison values are similar to :ref:`undef values <undefvalues>`, however
-they also represent the fact that an instruction or constant expression
-that cannot evoke side effects has nevertheless detected a condition
-that results in undefined behavior.
+In order to facilitate speculative execution, many instructions do not
+invoke immediate undefined behavior when provided with illegal operands,
+and return a poison value instead.
There is currently no way of representing a poison value in the IR; they
only exist when produced by operations such as :ref:`add <i_add>` with
@@ -3245,9 +3246,22 @@ Poison value behavior is defined in terms of value *dependence*:
successor.
- Dependence is transitive.
-Poison values have the same behavior as :ref:`undef values <undefvalues>`,
-with the additional effect that any instruction that has a *dependence*
-on a poison value has undefined behavior.
+An instruction that *depends* on a poison value, produces a poison value
+itself. A poison value may be relaxed into an
+:ref:`undef value <undefvalues>`, which takes an arbitrary bit-pattern.
+This means that immediate undefined behavior occurs if a poison value is
+used as an instruction operand that has any values that trigger undefined
+behavior. Notably this includes (but is not limited to):
+- The pointer operand of a :ref:`load <i_load>`, :ref:`store <i_store>` or
+any other pointer dereferencing instruction (independent of address
+space).
+- The divisor operand of a ``udiv``, ``sdiv``, ``urem`` or ``srem``
+instruction.
+Additionally, undefined behavior occurs if a side effect *depends* on poison.
+This includes side effects that are control dependent on a poisoned branch.
Here are some examples:
@@ -3257,13 +3271,12 @@ Here are some examples:
%poison = sub nuw i32 0, 1 ; Results in a poison value.
%still_poison = and i32 %poison, 0 ; 0, but also poison.
%poison_yet_again = getelementptr i32, i32* @h, i32 %still_poison
-store i32 0, i32* %poison_yet_again ; memory at @h[0] is poisoned
+store i32 0, i32* %poison_yet_again ; Undefined behavior due to
+; store to poison.
store i32 %poison, i32* @g ; Poison value stored to memory.
%poison2 = load i32, i32* @g ; Poison value loaded back from memory.
store volatile i32 %poison, i32* @g ; External observation; undefined behavior.
%narrowaddr = bitcast i32* @g to i16*
%wideaddr = bitcast i32* @g to i64*
%poison3 = load i16, i16* %narrowaddr ; Returns a poison value.

@@ -4336,6 +4336,8 @@ bool llvm::isGuaranteedToExecuteForEveryIteration(const Instruction *I,
}
bool llvm::propagatesFullPoison(const Instruction *I) {
+// TODO: This should include all instructions apart from phis, selects and
+// call-like instructions.
switch (I->getOpcode()) {
case Instruction::Add:
case Instruction::Sub: