Summary:
There was even a TODO for this.
The main motivation is to make use of call-site based
`__attribute__((alloc_align(param_idx)))` validation (D72996).
Reviewers: rsmith, erichkeane, aaron.ballman, jdoerfert
Reviewed By: rsmith
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73020
If we deduplicate OpenMP runtime calls we have multiple `ident_t*` that
represent information like source location. So far, we simply kept the
one used by the replacement call. However, as exposed by PR44893, that
can cause problems if we have stack allocated `ident_t` objects. While
we need to revisit the use of these as well, it is clear that we
eventually want to merge source location information in some way. With
this patch we add the infrastructure to do so but without doing the
actual merge. Instead we pick a global `ident_t` from the replaced
calls, if possible, or create a new one with an unknown location
instead.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D74925
Summary:
Removes patterns that were not doing useful work, changes the
default extract instructions to be the unsigned versions now that
they are enabled by default, fixes PR44988, and adds tests for
sext_inreg lowering.
Reviewers: aheejin
Reviewed By: aheejin
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75005
Summary:
This patch adds `getAssociatedRange` which, for a given decl, computes preceding
and trailing text that would conceptually be associated with the decl by the
reader. This includes comments, whitespace, and separators like ';'.
Reviewers: gribozavr
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72153
body
We started seeing cases where ARC optimizer would move retain calls into
loop bodies, causing imbalance in the number of retain and release
calls, after changes were made to delete inert ARC calls since the inert
calls that used to block code motion are gone.
Fix the bug by setting the CFG hazard flag when visiting a loop header.
rdar://problem/56908836
Summary:
Instead of using global variables with unpredicted time of
deinitialization, use dynamically allocated variables with functions
explicitly marked as global constructor/destructor and priority. This
allows to prevent the crash because of the incorrect order of dynamic
libraries deinitialization.
Reviewers: grokos, hfinkel
Subscribers: caomhin, kkwli0, openmp-commits
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D74837
Verifies that an argument passed to __builtin_frame_address or __builtin_return_address is within the range [0, 0xFFFF]
Differential revision: https://reviews.llvm.org/D66839
Re-committed after fixed: c93112dc4f
Summary:
Replacing uses of IV outside of the loop is likely generally useful,
but `rewriteLoopExitValues()` is cautious, and if it isn't told to always
perform the replacement, and there are hard uses of IV in loop,
it doesn't replace.
In [[ https://bugs.llvm.org/show_bug.cgi?id=44668 | PR44668 ]],
that prevents `-indvars` from replacing uses of induction variable
after the loop, which might be one of the optimization failures
preventing that code from being vectorized.
Instead, now that the cost model is fixed, i believe we should be
a little bit more optimistic, and also perform replacement
if we believe it is within our budget.
Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=44668 | PR44668 ]].
Reviewers: reames, mkazantsev, asbirlea, fhahn, skatkov
Reviewed By: mkazantsev
Subscribers: nikic, hiraditya, zzheng, javed.absar, dmgreen, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73501
Summary:
Previosly we simply always said that `SCEVMinMaxExpr` is too costly to expand.
But this isn't really true, it expands into just a comparison+swap pair.
And again much like with add/mul, there will be one less such pair
than the number of operands. And we need to count the cost of operands themselves.
This does change a number of testcases, and as far as i can tell,
all of these changes are improvements, in the sense that
we fixed up more latches to do the [in]equality comparison.
This concludes cost-modelling changes, no other SCEV expressions exist as of now.
This is a part of addressing [[ https://bugs.llvm.org/show_bug.cgi?id=44668 | PR44668 ]].
Reviewers: reames, mkazantsev, wmi, sanjoy
Reviewed By: mkazantsev
Subscribers: hiraditya, javed.absar, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73744
Summary:
So, i wouldn't call this *obviously* correct,
but i think i got it right this time :)
Roughly, we have
```
Op0*x^0 + Op1*x^1 + Op2*x^2 ...
```
where `Op_{n} * x^{n}` is called term, and `n` the degree of term.
Due to the way they are stored internally in `SCEVAddRecExpr`,
i believe we can have `Op_{n}` to be `0`, so we should not charge for those.
I think it is most straight-forward to count the cost in 4 steps:
1. First, count it the same way we counted `scAddExpr`, but be sure to skip terms with zero constants.
Much like with `add` expr we will have one less addition than number of terms.
2. Each non-constant term (term degree >= 1) requires a multiplication between the `Op_{n}` and `x^{n}`.
But again, only charge for it if it is required - `Op_{n}` must not be 0 (no term) or 1 (no multiplication needed),
and obviously don't charge constant terms (`x^0 == 1`).
3. We must charge for all the `x^0`..`x^{poly_degree}` themselves.
Since `x^{poly_degree}` is `x * x * ... * x`, i.e. `poly_degree` `x`'es multiplied,
for final `poly_degree` term we again require `poly_degree-1` multiplications.
Note that all the `x^{0}`..`x^{poly_degree-1}` will be computed for the free along the way there.
4. And finally, the operands themselves.
Here, much like with add/mul exprs, we really don't look for preexisting instructions..
Reviewers: reames, mkazantsev, wmi, sanjoy
Reviewed By: mkazantsev
Subscribers: hiraditya, javed.absar, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73741
Summary:
While this resolves the regression from D73722 in `llvm/test/Transforms/IndVarSimplify/exit_value_test2.ll`,
this now regresses `llvm/test/Transforms/IndVarSimplify/elim-extend.ll` `@nestedIV` test,
we no longer can perform that expansion within default budget of `4`, but require budget of `6`.
That regression is being addressed by D73777.
The basic idea here is simple.
```
Op0, Op1, Op2 ...
| | |
\--+--/ |
| |
\---+---/
```
I.e. given N operands, we will have N-1 operations,
so we have to add cost of an add (mul) for **every** Op processed,
**except** the first one, plus we need to recurse into *every* Op.
I'm guessing there's already canonicalization that ensures we won't
have `1` operand in `scMulExpr`, and no `0` in `scAddExpr`/`scMulExpr`.
Reviewers: reames, mkazantsev, wmi, sanjoy
Reviewed By: mkazantsev
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73728
Summary:
If we don't believe this UDiv is actually a LShr in disguise, things are much worse.
First, we try to see if this UDiv actually originates from user code,
by looking for `S + 1`, and if found considering this UDiv to be free.
But otherwise, we always considered this UDiv to be high-cost.
However that is no longer the case with TTI-driven cost model:
our default budget is 4, which matches the default cost of UDiv,
so now we allow a single UDiv to not be counted as high-cost.
While that is the case, it is evident this is actually a regression
due to the fact that cost-modelling is incomplete - we did not account
for the `add`, `mul` costs yet. That is being addressed in D73728.
Cost-modelling for UDiv also seems pretty straight-forward:
subtract cost of the UDiv itself, and recurse into both the LHS and RHS.
Reviewers: reames, mkazantsev, wmi, sanjoy
Reviewed By: mkazantsev
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73722
Summary:
Like with casts, we need to subtract the cost of `lshr` instruction
from budget, and recurse into LHS operand.
Seems "pretty obviously correct" to me?
To be noted, there is a number of other shortcuts we //could// cost-model:
* `... + (-1 * ...)` -> `... - ...` <- likely very frequent case
* `x - (rem x, power-of-2)`, which is currently `(x udiv power-of-2) * power-of-2` -> `x & -log2(power-of-2)`
* `rem x, power-of-2`, which is currently `x - ((x udiv power-of-2) * power-of-2)` -> `x & log2(power-of-2)-1`
* `... * power-of-2` -> `... << log2(power-of-2)` <- likely not very beneficial
Reviewers: reames, mkazantsev, wmi, sanjoy
Reviewed By: mkazantsev
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73718
Summary:
This is not a NFC, although it does not change any of the existing tests.
I'm not really sure if we should have specific tests for the cost modelling itself.
This is the first patch that actually makes `SCEVExpander::isHighCostExpansionHelper()`
account for the cost of the SCEV expression, and consider the budget available,
by modelling cast expressions.
I believe the logic itself is "pretty obviously correct" - from budget,
we need to subtract the cost of the cast expression from inner type `Op->getType()`
to the `S->getType()` type, and recurse into the expression we are casting.
Reviewers: reames, mkazantsev, wmi, sanjoy
Reviewed By: mkazantsev
Subscribers: xbolva00, hiraditya, javed.absar, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73716
Summary:
Currently, as per `check-llvm`, we never call `SCEVExpander::isHighCostExpansion()` with null TTI,
so this appears to be a safe restriction.
Reviewers: reames, mkazantsev, wmi, sanjoy
Reviewed By: mkazantsev
Subscribers: javed.absar, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73712
Summary:
As far as i can tell this is still NFC.
Initially in rL146438 it was added at the top of the function,
later rL238507 dethroned it, and rL244474 did it again.
I'm not sure if we have already checked the cost of this expansion, we should be doing that again.
Reviewers: reames, mkazantsev, wmi, sanjoy, atrick, igor-laevsky
Reviewed By: mkazantsev
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73706
Summary:
In future patches`SCEVExpander::isHighCostExpansionHelper()` will respect the budget allocated by performing TTI cost modelling.
This is a fully NFC patch to make things reviewable.
Reviewers: reames, mkazantsev, wmi, sanjoy
Reviewed By: mkazantsev
Subscribers: hiraditya, zzheng, javed.absar, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73705
Summary:
Future patches will make use of TTI to perform cost-model-driven `SCEVExpander::isHighCostExpansionHelper()`
This is a fully NFC patch to make things reviewable.
Reviewers: reames, mkazantsev, wmi, sanjoy
Reviewed By: mkazantsev
Subscribers: hiraditya, zzheng, javed.absar, dmgreen, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73704
Move the implementation of __libcpp_thread_poll_with_backoff
and __libcpp_timed_backoff_policy::operator() out of the
_LIBCPP_HAS_THREAD_API_PTHREAD block. None of the code in these
methods is pthreads specific.
Also add "inline _LIBCPP_INLINE_VISIBILITY" to
__libcpp_timed_backoff_policy::operator(), to avoid errors due to
multiple definitions of the operator. Contrary to
__libcpp_thread_poll_with_backoff (which is a template function),
this is a normal non-templated method.
Differential Revision: https://reviews.llvm.org/D75102
The debugserver profile thread used to suspend itself between samples with
a usleep. When you detach or kill, MachProcess::Clear would delay replying
to the incoming packet until pthread_join of the profile thread returned.
If you are unlucky or the suspend delay is long, it could take longer than
the packet timeout for pthread_join to return. Then you would get an error
about detach not succeeding from lldb - even though in fact the detach was
successful...
I replaced the usleep with PThreadEvents entity. Then we just call a timed
WaitForEventBits, and when debugserver wants to stop the profile thread, it
can set the event bit, and the sleep will exit immediately.
Differential Revision: https://reviews.llvm.org/D75004
Summary:
Implement the DWARF register mapping described in
llvm/docs/AMDGPUUsage.rst
This is currently limited to wave64 VGPRs/AGPRs.
This also includes some minor changes in AMDGPUInstPrinter,
AMDGPUMCTargetDesc, and AMDGPUAsmParser to make generating CFI assembly
text and ELF sections possible to ease testing, although complete CFI
support is not yet implemented.
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74915
Add a dump method that recursively prints an instruction and all
the instructions defining its operands and so on.
This is helpful when looking at combiner issue.
NFC
Differential Revision: https://reviews.llvm.org/D75094
This reverts commit 8d22100f66.
There was a functional regression reported (https://bugs.llvm.org/show_bug.cgi?id=44996). I'm not actually sure the patch is wrong, but I don't have time to investigate currently, and this line of work isn't something I'm likely to get back to quickly.
Summary:
In libc++, we normally #ifdef out header content instead of #erroring
out when the Standard in use is insufficient for the requirements of
the header.
Reviewers: EricWF
Subscribers: jkorous, dexonsmith, libcxx-commits, teemperor
Tags: #libc
Differential Revision: https://reviews.llvm.org/D75074
This patch fixes PR44896. For IR input files, option fdiscard-value-names
should be ignored as we need named values in loadModule().
Commit 60d3947922 sets this option after loadModule() where valued names
already created. This creates an inconsistent state in setNameImpl()
that leads to a seg fault.
This patch forces fdiscard-value-names to be false for IR input files.
This patch also emits a warning of "ignoring -fdiscard-value-names" if
option fdiscard-value-names is explictly enabled in the commandline for
IR input files.
Differential Revision: https://reviews.llvm.org/D74878
Compute and propagate conversion kind to diagnostics helper in C++
to provide more specific diagnostics about incorrect implicit
conversions in assignments, initializations, params, etc...
Duplicated some diagnostics as errors because C++ is more strict.
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74116
Summary:
This fixes a clangd rename issue, which is missing the reference of
an incomplete specialization.
Unfortunately, I didn't reproduce this issue in clang-rename, I guess
the input `FoundDecl` of AdditionalUSRFinder is different in clangd vs
clang-rename, clang-rename uses the underlying CXXRecordDecl of the
ClassTemplateDecl, which is fixed in 5d862c042b;
while clangd-rename uses the ClassTemplateDecl.
Reviewers: kbobyrev
Reviewed By: kbobyrev
Subscribers: ilya-biryukov, jkorous, arphaman, kadircet, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74829
So far we've been dropping coverage every time we've encountered
a CXXInheritedCtorInitExpr. This patch attempts to add some
initial support for it.
Constructors for arguments of a CXXInheritedCtorInitExpr are still
not fully supported.
Differential Revision: https://reviews.llvm.org/D74735