Commit Graph

23 Commits

Author SHA1 Message Date
Momchil Velikov 6398903ac8 Extend the `uwtable` attribute with unwind table kind
We have the `clang -cc1` command-line option `-funwind-tables=1|2` and
the codegen option `VALUE_CODEGENOPT(UnwindTables, 2, 0) ///< Unwind
tables (1) or asynchronous unwind tables (2)`. However, this is
encoded in LLVM IR by the presence or the absence of the `uwtable`
attribute, i.e.  we lose the information whether to generate want just
some unwind tables or asynchronous unwind tables.

Asynchronous unwind tables take more space in the runtime image, I'd
estimate something like 80-90% more, as the difference is adding
roughly the same number of CFI directives as for prologues, only a bit
simpler (e.g. `.cfi_offset reg, off` vs. `.cfi_restore reg`). Or even
more, if you consider tail duplication of epilogue blocks.
Asynchronous unwind tables could also restrict code generation to
having only a finite number of frame pointer adjustments (an example
of *not* having a finite number of `SP` adjustments is on AArch64 when
untagging the stack (MTE) in some cases the compiler can modify `SP`
in a loop).
Having the CFI precise up to an instruction generally also means one
cannot bundle together CFI instructions once the prologue is done,
they need to be interspersed with ordinary instructions, which means
extra `DW_CFA_advance_loc` commands, further increasing the unwind
tables size.

That is to say, async unwind tables impose a non-negligible overhead,
yet for the most common use cases (like C++ exceptions), they are not
even needed.

This patch extends the `uwtable` attribute with an optional
value:
      -  `uwtable` (default to `async`)
      -  `uwtable(sync)`, synchronous unwind tables
      -  `uwtable(async)`, asynchronous (instruction precise) unwind tables

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D114543
2022-02-14 14:35:02 +00:00
Johannes Doerfert b51b83f68e [Attributor] Introduce the concept of query AAs
D106720 introduced features that did not work properly as we could add
new queries after a fixpoint was reached and which could not be answered
by the information gathered up to the fixpoint alone.

As an alternative to D110078, which forced eager computation where we
want to continue to be lazy, this patch fixes the problem.

QueryAAs are AAs that allow lazy queries during their lifetime. They are
never fixed if they have no outstanding dependences and always run as
part of the updates in an iteration. To determine if we are done, all
query AAs are asked if they received new queries, if not, we only need
to consider updated AAs, as before. If new queries are present we go for
another iteration.

Differential Revision: https://reviews.llvm.org/D118669
2022-02-01 01:40:44 -06:00
Johannes Doerfert ac3ec22df9 [Attributor] Use AAFunctionReachability to determine AANoRecurse
We missed out on AANoRecurse in the module pass because we had no call
graph. With AAFunctionReachability we can simply ask if the function may
reach itself.

Differential Revision: https://reviews.llvm.org/D110099
2022-02-01 01:40:44 -06:00
Johannes Doerfert d1186ce7a9 [Attributor] Make interprocedural value explicit in genericValueTraversal
genericValueTraversal can look through arguments and allow value
simplification across function boundaries. In fact, the latter already
happened unchecked. With this change we allow the user of
genericValueTraversal to opt-out of interprocedural traversal if
required. We explicitly look through arguments now which helps to do
various things, incl. the propagation of constants into OpenMP parallel
regions (on the host).
2022-02-01 01:40:44 -06:00
Johannes Doerfert 0f471710f8 [Attributor] Use edge liveness rather than block liveness
We moved to the edge API a while back, not all uses were adjusted.
Edge liveness is more precise.
2022-02-01 01:18:51 -06:00
Johannes Doerfert adf0d57f15 [Attributor] Provide convenient helpers for isAssumedRead{None,Only}
We have two attributes that can answer readnone queries. While there is
a dependence between them, it seems best to not force the users to know
what AA to ask. The helpers also allow to check for readonly nicely.

Test changes show where we now deduce readnone but haven't before,
mostly because we only asked AAMemoryBehavior and not AAMemoryLocation.
AANoAlias has not been ported to the new API yet.
2022-02-01 01:18:51 -06:00
Johannes Doerfert e140d51319 [Attributor] Use CFG reasoning to filter potentially interfering writes
Since D104432 we can look through memory by analyzing all writes that
might interfere with a load. This patch provides some logic to exclude
writes that cannot interfere with a location, due to CFG reasoning.
We make sure to avoid multi-thread write-read situations properly while
we ignore writes that cannot reach a load or writes that will be
overwritten before the load is reached.

Differential Revision: https://reviews.llvm.org/D106397
2022-02-01 01:18:51 -06:00
Philip Reames d1f4c6a611 [Attributor] Generalize calloc handling in heap-to-stack for any init value [NFC]
Rewrite the calloc specific handling in heap-to-stack to allow arbitrary init values.  The basic problem being solved is that if an allocation is initilized to anything other than zero, this must be explicitly done for the formed alloca as well.

This covers the calloc case today, but once a couple of earlier guards are removed in this code, downstream allocators with other init values could also be handled.

Inspired by discussion on D116971
2022-01-12 16:58:39 -08:00
Johannes Doerfert 7b39dccbe4 [Attributor][FIX] Ensure "IsExact" is false for non-exact accesses
If we look at potentially interfering accesses we need to ensure the
"IsExact" flag is set appropriately. Accesses that have an "unknown"
size or offset cannot be exact matches and we missed to flag that.

Error and test reported by Serguei N. Dmitriev.
2022-01-10 10:09:36 -06:00
Johannes Doerfert 5602c866c0 [Attributor] Look through allocated heap memory
AAPointerInfo, and thereby other places, can look already through
internal global and stack memory. This patch enables them to look
through heap memory returned by functions with a `noalias` return.

In the future we can look through `noalias` arguments as well but that
will require AAIsDead to learn that such memory can be inspected by the
caller later on. We also need teach AAPointerInfo about dominance to
actually deal with memory that might not be `null` or `undef`
initialized. D106397 is a first step in that direction already.

Reviewed By: kuter

Differential Revision: https://reviews.llvm.org/D109170
2021-12-29 00:21:36 -06:00
Johannes Doerfert e05940de2a [Attributor][FIX] Recursion via memory needs to be tracked explicitly
Recursion can happen when we see a PHI use the second time or when we
look at a store value operand use again. We already visited the
potential copies and doing so again will just cause endless looping.

Reviewed By: kuter

Differential Revision: https://reviews.llvm.org/D108190
2021-08-27 13:12:13 -05:00
Johannes Doerfert 5f543919b2 [Attributor][FIX] Guard constant casts with type size checks 2021-08-12 10:39:53 -05:00
Johannes Doerfert e7e3585cde [Attributor][FIX] Handle recurrences (PHIs) in AAPointerInfo explicitly
PHI nodes are not pass through but change their value, we have to
account for that to avoid missing stores.

Follow up for D107798 to fix PR51249 for good.

Differential Revision: https://reviews.llvm.org/D107808
2021-08-11 00:49:54 -05:00
Johannes Doerfert 96da6dd6ba [Attributor][FIX] Only avoid visiting PHI uses multiple times (PR51249)
AAPointerInfoFloating needs to visit all uses and some multiple times if
we go through PHI nodes. Attributor::checkForAllUses keeps a visited set
so we don't recurs endlessly. We now allow recursion for non-phi uses so
we track all pointer offsets via PHI nodes properly without endless
recursion.

This replaces the first attempt D107579.

Differential Revision: https://reviews.llvm.org/D107798
2021-08-11 00:49:54 -05:00
Johannes Doerfert f358727ce0 [Attributor][NFC] Precommit reproducer for PR51249
The bulk of the changes come from attributes but only the @phi_store
function is effectively added.
2021-08-11 00:49:54 -05:00
Johannes Doerfert c55e18824d [Attributor][FIX] Copy all members in the assignment operator
Also improve debug output slightly.
2021-07-27 01:44:13 -05:00
Johannes Doerfert d4bfce5521 [Attributor] Utilize the InstSimplify interface to simplify instructions
When we simplify at least one operand in the Attributor simplification
we can use the InstSimplify to work on the simplified operands. This
allows us to avoid duplication of the logic.

Depends on D106189

Differential Revision: https://reviews.llvm.org/D106190
2021-07-27 00:56:23 -05:00
Johannes Doerfert 41bd26dff9 [Attributor] Delete dead stores
D106185 allows us to determine if a store is needed easily. Using that
knowledge we can start to delete dead stores.

In AAIsDead we now track more state as an instruction can be dead (= the
old optimisitc state) or just "removable". A store instruction can be
removable while being very much alive, e.g., if it stores a constant
into an alloca or internal global. If we would pretend it was dead
instead of only removablewe we would ignore it when we determine what
values a load can see, so that is not what we want.

Differential Revision: https://reviews.llvm.org/D106188
2021-07-26 23:33:36 -05:00
Johannes Doerfert 5fbb51d8d5 [Attributor] Extend the AAValueSimplify compare simplification logic
We first simplify the operands of a compare and then reason on the
simplified versions, e.g., with AANonNull.

This does improve the simplification capabilities but also fixes a
potential problem that has not yet been observed by simplifying the
operands first.
2021-07-20 00:35:14 -05:00
Johannes Doerfert c2281f1565 [Attributor] Introduce AAPointerInfo
This patch introduces AAPointerInfo which tracks the uses of a pointer
and places them in "bins" based on their offset from the base and access
size.

As with other AAs, any pointer can be tracked but it is up to the user
to make sense of the results. The user in this patch is AAValueSimplify
and AAPotentialValues which both utilize AAPointerInfo to determine the
value of a load. For now, this is restricted to loads of allocas and
internal globals. Through the use of AAPointerInfo and the "bins" we can
track struct members separately. The users also know that storing only
zeros (at unknown indices) will result in loading only 0 (from unknown
indices). Other than that, the users are flow and context insensitive
(for now).

To deal with the "bins" more easily, AAPointerInfo provides a
forallInterfearingAccesses that applies a callback on all accesses
that might interfere with a given load or store.

Differential Revision: https://reviews.llvm.org/D104432
2021-07-19 22:48:35 -05:00
Johannes Doerfert 28c78a9e12 [Attributor] Simplify loads
As a first step to simplify loads we only handle `null` and `undef`
underlying objects, as well as objects that have the load as a single user.
Loads of those values can be replaced by the initializer, if any.
Proper reasoning is introduced in a follow up patch

Differential Revision: https://reviews.llvm.org/D103862
2021-07-19 22:47:29 -05:00
Johannes Doerfert fc82409b5c [Attributor] Simplify operands inside of simplification AAs first
When we do simplification via AAPotentialValues or AAValueConstantRange
we need to simplify the operands of an instruction we deconstruct first.
This does not only improve the result, see for example range.ll, but is
required as we allow outside AAs to provide simplification rules via
callbacks. If we do ignore the simplification rules and base other
simplifications on the IR instead we can create an inconsistent state.
2021-07-06 22:41:18 -05:00
Johannes Doerfert 39e1876b06 [Attributor][NFC] Precommit a set of test cases for load simplification 2021-06-18 01:07:51 -05:00