This patch adds corresponding memory effects to mlir llvm-dialect load/store/addressof ops, which thus enables canonicalizations of those ops (like dead code elimination) that rely on the effect interface
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D117041
ROCm 4.5 device library introduced __ockl_dm_alloc and __ockl_dm_dealloc
for supporting device side malloc/free.
This patch redefines device malloc/free to use these functions.
It also fixes a bug in the wrapper header which incorrectly defines free
with return type void* instead of void.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D116967
In D115311, we're looking to modify clang to emit i constraints rather
than X constraints for callbr's indirect destinations. Prior to doing
so, update all of the existing tests in llvm/ to match.
Reviewed By: void, jyknight
Differential Revision: https://reviews.llvm.org/D115410
when generating copy/dispose helper functions
Analyze the block captures just once before generating copy/dispose
block helper functions and honor the inert `__unsafe_unretained`
qualifier. This refactor fixes a bug where captures of ObjC
`__unsafe_unretained` and class types were needlessly retained/released
by the copy/dispose helper functions.
Differential Revision: https://reviews.llvm.org/D116948
Several of the comments were annotating the wrong argument.
I caught this while reviewing this clean-up: 8afcfbfb8f
which was changing booleans to use true and false and in the this case the comment and the type looked mismatched.
Differential Revision: https://reviews.llvm.org/D116982
This patch fixes:
mlir/lib/Dialect/Tosa/Transforms/TosaOptionalDecompositions.cpp:28:8:
error: 'runOnFunction' overrides a member function but is not marked
'override' [-Werror,-Wsuggest-override]
Completely rework how we handle X constrained labels for inline asm.
X should really be treated as i. Then existing tests can be moved to use
i D115410 and clang can just emit i D115311. (D115410 and D115311 are
callbr, but this can be done for label inputs, too).
Coincidentally, this simplification solves an ICE uncovered by D87279
based on assumptions made during D69868.
This is the third approach considered. See also discussions v1 (D114895)
and v2 (D115409).
Reported-by: kernel test robot <lkp@intel.com>
Fixes: https://github.com/ClangBuiltLinux/linux/issues/1512
Reviewed By: void, jyknight
Differential Revision: https://reviews.llvm.org/D115688
Moved all TOSA decomposition patterns so that they can be optionally populated
and used by external rewrites. This avoids decomposing TOSa operations when
backends may benefit from the non-decomposed version.
Reviewed By: rsuderman, mehdi_amini
Differential Revision: https://reviews.llvm.org/D116526
In D116873 I did this for libunwind prior to defining a new install path
variable. But I think the change is good on its own, and libc++{,abi}
could also use it.
libc++ needed the base header var defined above the conditional part to
use it for the prefi+ed headers in the non-target-specific case. For
consistency, I therefore put the unconditional ones above for all 3
libs, which is why I touched the libunwind code (seeing that it had the
core change already)
Reviewed By: phosek, #libunwind, #libc, #libc_abi, ldionne
Differential Revision: https://reviews.llvm.org/D116988
As pointed out in https://reviews.llvm.org/D115688#inline-1108193, we
don't want to sink the save point past an INLINEASM_BR, otherwise
prologepilog may incorrectly sink a prolog past the MBB containing an
INLINEASM_BR and into the wrong MBB.
ShrinkWrap is getting this wrong because LR is not in the list of callee
saved registers. Specifically, ShrinkWrap::useOrDefCSROrFI calls
RegisterClassInfo::getLastCalleeSavedAlias which reads
CalleeSavedAliases which was populated by
RegisterClassInfo::runOnMachineFunction by iterating the list of
MCPhysReg returned from MachineRegisterInfo::getCalleeSavedRegs.
Because PPC's LR is non-allocatable, it's NOT considered callee saved.
Add an interface to TargetRegisterInfo for such a case and use it in
Shrinkwrap to ensure we don't sink a prolog past an INLINEASM or
INLINEASM_BR that clobbers LR.
Reviewed By: jyknight, efriedma, nemanjai, #powerpc
Differential Revision: https://reviews.llvm.org/D116424
Apply scale may operate on vectors, scalars, or tensors during
tiling. Relax the requirements to avoid failures.
Reviewed By: NatashaKnk
Differential Revision: https://reviews.llvm.org/D116981
Given an if of the form, simplify it by eliminating the not and swapping the regions
scf.if not(c) {
yield origTrue
} else {
yield origFalse
}
becomes
scf.if c {
yield origFalse
} else {
yield origTrue
}
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D116990
Currently the way some relocation-related static functions pass around
states is clumsy. Add a Resolver class to store some states as member
variables.
Advantages:
* Avoid the parameter `InputSectionBase &sec` (this offsets the cost passing around `this` paramemter)
* Avoid the parameter `end` (Mips and PowerPC hacks)
* `config` and `target` can be cached as member variables to reduce global state accesses. (potential speedup because the compiler didn't know `config`/`target` were not changed across function calls)
* If we ever want to reduce if-else costs (e.g. `config->emachine==EM_MIPS` for non-Mips) or introduce parallel relocation scan not handling some tricky arches (PPC/Mips), we can templatize Resolver
`target` isn't used as much as `config`, so I change it to a const reference
during the migration.
There is a minor performance inprovement for elf::scanRelocations.
Reviewed By: ikudrin, peter.smith
Differential Revision: https://reviews.llvm.org/D116881
All named ops list iterators for accessing output first except
pooling ops. This commit made the pooling ops consistent with
the rest.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D115520
Alternative to D116817.
This introduces a new value-based folding interface for Or (FoldOr),
which takes 2 values and returns an existing Value or a constant if the
Or can be simplified. Otherwise nullptr is returned. This replaces the
more restrictive CreateOr which takes 2 constants.
This is the used to implement a folder that uses InstructionSimplify.
The logic to simplify `Or` instructions is moved there. Subsequent
patches are going to transition other CreateXXX to the more general
FoldXXX interface.
Reviewed By: nikic, lebedev.ri
Differential Revision: https://reviews.llvm.org/D116935
Fixes https://github.com/llvm/llvm-project/issues/52976.
- Make no formatting for macros
- Attach comment with definition headers
- Make no change on use of empty lines at block start/end
- Fix misrecognition of keyword namespace
Differential Revision: https://reviews.llvm.org/D116663
Reviewed By: MyDeveloperDay, HazardyKnusperkeks, curdeius
This is a re-commit of e2c7ee0743 which
was reverted in a2a58d91e8 and
ea81cea816. This includes a fix to
consistently check for EFLAGS being live-out. See phabricator
review.
Original Summary:
This extends `optimizeCompareInstr` to re-use previous comparison
results if the previous comparison was with an immediate that was 1
bigger or smaller. Example:
CMP x, 13
...
CMP x, 12 ; can be removed if we change the SETg
SETg ... ; x > 12 changed to `SETge` (x >= 13) removing CMP
Motivation: This often happens because SelectionDAG canonicalization
tends to add/subtract 1 often when optimizing for fallthrough blocks.
Example for `x > C` the fallthrough optimization switches true/false
blocks with `!(x > C)` --> `x <= C` and canonicalization turns this into
`x < C + 1`.
Differential Revision: https://reviews.llvm.org/D110867
Similar for ceil, trunc, round, and roundeven. This allows us to use
static rounding modes to avoid a libcall.
This optimization is done for AArch64 as isel patterns.
RISCV doesn't have instructions for ceil/floor/trunc/round/roundeven
so the operations don't stick around until isel to enable a pattern
match. Thus I've implemented a DAG combine.
We only handle XLen types except i32 on RV64. i32 will be type
legalized to a RISCVISD node. All other types will be type legalized
to XLen and maintain the FP_TO_SINT/UINT ISD opcode.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D116771
When spirv-link is found, it won't match a leading `"`. This fixes
the test added by commit dbb8d08637 ("[SPIR-V] Add linking using
spirv-link.", 2022-01-11).
In D116750, the `clangFrontend` library was added as a dependency of `LexTests` in order to make `clang::ApplyHeaderSearchOptions()` available. This increased the number of TUs the test depends on.
This patch moves the function into `clangLex` and removes dependency of `LexTests` on `clangFrontend`.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D117024
Specifically, mmap and munmap have been moved to the default build list
of entrypoints. To support this, certain deps and includes have been
adjusted. The use of errno in some cases has been updated.
Extend the existing malloc-family specific optimization to all noalias calls. This allows us to handle allocation wrappers, and removes a dependency on a lib-func check in favor of generic attribute usage.
Differential Revision: https://reviews.llvm.org/D116980
We would like to nominate Andy Kaylor and Sergey Maslov to join the LLVM security group as a representative of Intel. Both are members of the Intel compiler team, and would like to register as vendor contacts. Intel packages and distributes LLVM-based toolchains as part of our compiler products. As such, we would like to be aware of any security vulnerability found in the compiler, and would like to contribute to the resolution of such issues.
Please let us know if anything is missing from the nomination.
Reviewed By: apilipenko, dim, george.burgess.iv, kristof.beyls, mattdr, nikhgupt, probinson, peter.smith, pietroalbini, steveklabnik
Differential Revision: https://reviews.llvm.org/D115657
D92270 updated constant expression folding to fold inbounds GEP to
poison if the base is undef. Apply the same logic to SimplifyGEPInst.
The justification is that we can choose an out-of-bounds pointer as base
pointer.
Reviewed By: nikic, lebedev.ri
Differential Revision: https://reviews.llvm.org/D117015
We could just merge all umin into umin_seq, but that is likely
a pessimization, so don't do that, but pretend that we did
for the purpose of deduplication.
analyzeGlobal() looks through non-constexpr cast instructions when
looking for users. However, this particular place only strips the
casts again if they are constexprs. We should be looking through all
casts here.
The elements of `SearchPath::SearchDirs` are being referenced to by their indices. This proved to be error-prone: `HeaderSearch::SearchDirToHSEntry` was accidentally not being updated in `HeaderSearch::AddSearchPath()`. This patch fixes that by referencing `SearchPath::SearchDirs` elements by their address instead, which is stable thanks to the bump-ptr-allocation strategy.
Reviewed By: ahoppen
Differential Revision: https://reviews.llvm.org/D116750