GCC warning:
```
/llvm-project/clang/lib/Frontend/TestModuleFileExtension.cpp:131:20: warning: ‘llvm::raw_ostream& clang::operator<<(llvm::raw_ostream&, const clang::TestModuleFileExtension&)’ has not been declared within ‘clang’
131 | llvm::raw_ostream &clang::operator<<(llvm::raw_ostream &OS,
| ^~~~~
In file included from /llvm-project/clang/lib/Frontend/TestModuleFileExtension.cpp:8:
/llvm-project/clang/lib/Frontend/TestModuleFileExtension.h:75:3: note: only here as a ‘friend’
75 | operator<<(llvm::raw_ostream &OS, const TestModuleFileExtension &Extension);
| ^~~~~~~~
```
Add mimgopc object to represent the opcode allowing different
opcodes for different hardware variants.
This enables image_atomic_fcmpswap, image_atomic_fmin, and
image_atomic_fmax on GFX10
Reviewed By: foad, rampitec
Differential Revision: https://reviews.llvm.org/D96309
Move SimplifiyVisitor from Simplify.h to Simplify.cpp. It is not
relevant for applying the pass in either the NewPM or the legacyPM.
Rename it to SimplifyImpl to account for that.
This is possible due its state not being necessary to be preserved
between runs and thefore SimplifyImpl not needed to be held in the
pass object. Instead, SimplifyImpl is only instatiated for the
current Scop. In the NewPM as a function-local variable, and in the
legacy PM inside a llvm::Optional object because the state must be
preserved between the printScop (invoked by opt -analyze) and the most
recent runOnScop calls.
This removes the commuted PatFrags that only existed to carry
an SDNodeXForm in its OperandTransform field. We know all the places
that need to use the commuted SDNodeXForm and there is one transform
shared by signed and unsigned compares. So just hardcode the
the SDNodeXForm where it is needed and use the non commuted PatFrag
in the pattern.
I think when I wrote this I thought the SDNodeXForm name had to
match what is in the PatFrag that is being used. But that's not
true. The OperandTransform is only used when the PatFrag is used
in an instruction pattern and not a separate Pat pattern. All
the commuted cases are Pat patterns.
"using namespace" pollutes the namespace of every file that includes
such a header and universally considered a bad thing. Even the variant
namespace polly {
using namespace llvm;
}
(previously used by LoopGenerators.h) imports more symbols than the file
is in control of. The header may include a fixed set of files from LLVM,
but the header itself may by be included together with other headers
from LLVM. For instance, LLVM's MemorySSA.h and Polly's ScopInfo.h both
declare a class 'MemoryAccess' which may conflict.
Instead of prefixing everything in Polly's header files, this patch adds
'using' statements to import only the symbols that are actually
referenced in Polly. This approach is also used by MLIR to import
commonly used symbols into the mlir namespace.
This patch also puts the symbols declared in IslNodeBuilder.h into the
Polly namespace to also be able to use the imported symbols.
Updates static analyzer to be able to generate both sarif and html
output in a single run similar to plist-html.
Differential Revision: https://reviews.llvm.org/D96389
This patch is a follow up of D96422 and move the ShapeShiftType to
TableGen.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D96442
When accessing a specific procedure of a USE-associated generic
interface, we need to allow for the case in which that specific
procedure has the same name as the generic when testing for
its availability in the current scope.
Differential Revision: https://reviews.llvm.org/D96467
https://reviews.llvm.org/D95835 implements origin tracking for DFSan.
It reuses the chained origin depot of MSan.
This change moves the utility to sanitizer_common to share between
MSan and DFSan.
Reviewed-by: eugenis, morehouse
Differential Revision: https://reviews.llvm.org/D96319
Some state in name resolution is stored in the DeclarationVisitor
instance and processed at the end of the specification part.
This state needs to accommodate nested specification parts, namely
the ones that can be nested in a subroutine or function interface
body.
Differential Revision: https://reviews.llvm.org/D96466
An error has occurred when I build libunwind with -DLLVM_BUILD_DOCS=ON.
Reviewed By: #libunwind, compnerd
Differential Revision: https://reviews.llvm.org/D96107
The CMake changes in 2aa1af9b1d to make it possible to build MLIR as a
standalone project unfortunately disabled all unit-tests from the
regular in-tree build.
Rename the `RF_MoveDistinctMDs` flag passed into `MapValue` and
`MapMetadata` to `RF_ReuseAndMutateDistinctMDs` in order to more
precisely describe its effect and clarify the header documentation.
Found this while helping to investigate PR48841, which pointed out an
unsound use of the flag in `CloneModule()`. For now I've just added a
FIXME there, but I'm hopeful that the new (more precise) name will
prevent other similar errors.
This is the first patch of a serie to move FIR types to TableGen format as suggested in D96172.
This patch is setting up the files for FIR types and move the ShapeType to TableGen.
As discussed with @schweitz, I'm taking over this task to help the FIR upstreaming effort.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D96422
This test started failing after https://reviews.llvm.org/D95849
defaulted --allow-unused-prefixes to false.
Taking a look at the test, I didn't see an obvious need to add
OS-specific check lines for each supported value of %os.
rdar://74207657
A G_MUL + G_PTR_ADD can also be folded into a madd. So, conservatively, we
shouldn't combine when the G_MUL is used by a G_PTR_ADD either.
Differential Revision: https://reviews.llvm.org/D96457
After discussion, it seems like we want to go with
"inherent/discardable". These seem to best capture the relationship with
the op semantics and don't conflict with other terms.
Please let me know your preferences. Some of the other contenders are:
```
"intrinsic" side | "annotation" side
-----------------+------------------
characteristic | annotation
closed | open
definitional | advisory
essential | discardable
expected | unexpected
innate | acquired
internal | external
intrinsic | extrinsic
known | unknown
local | global
native | foreign
inherent | acquired
```
Rationale:
- discardable: good. discourages use for stable data.
- inherent: good
- annotation: redundant and doesn't convey difference
- intrinsic: confusable with "compiler intrinsics".
- definitional: too much of a mounthful
- extrinsic: too exotic of a word and hard to say
- acquired: doesn't convey the relationship to the semantics
- internal/external: not immediately obvious: what is internal to what?
- innate: similar to intrinsic but worse
- acquired: we don't typically think of an op as "acquiring" things
- known/unknown: by who?
- local/global: to what?
- native/foreign: to where?
- advisory: confusing distinction: is the attribute itself advisory or
is the information it provides advisory?
- essential: an intrinsic attribute need not be present.
- expected: same issue as essential
- unexpected: by who/what?
- closed/open: whether the set is open or closed doesn't seem essential
to the attribute being intrinsic. Also, in theory an op can have an
unbounded set of intrinsic attributes (e.g. `arg<N>` for func).
- characteristic: unless you have a math background this probably
doesn't make as much sense
Differential Revision: https://reviews.llvm.org/D96093
GlobalISel was only doing this with minsize. SDAG does this with optsize.
(See: `SelectionDAG::shouldOptForSize()`)
This is a 0.3% code size improvement for CTMark at -Os.
(Best: 1.1% improvements on lencod + pairlocalalign)
Differential Revision: https://reviews.llvm.org/D96451
While learning about ThreadPlan, I did a bit of cleanup:
* Remove unused code
* Move functions to protected where applicable
* Remove virtual for functions that are not overridden
Differential Revision: https://reviews.llvm.org/D96277
Break SampleProfileLoader into to a base and a derived class.
Base class (SampleProfileLoaderBaseImpl) includes the common
code for IR and MachineIR (CodeGen) sample loader.
It will be templatelized in the later patch.
Inline and Probe related code will remain in the derived class of
SampleProfileLoader and stays in SampleProfile.cpp.
We need to refactor some functions:
(1) getInstWeight() to enable the code sharing -- put the core into
getInstWeightImpl().
(2) emitAnnotation() and propagateWeights() to carve out the code
specific to SampleProfileLoader.
(3) make getInstWeight() and findFunctionSamples() virtual and override
in SampleProfileLoader as they need to access the fields in the derived
class.
Differential Revision: https://reviews.llvm.org/D95832
When we have a G_ADD which is fed by a G_ICMP on one side, we can fold it into
the cset for the G_ICMP.
e.g. Given
```
%cmp = G_ICMP ... %x, %y
%add = G_ADD %cmp, %z
```
We would normally emit a cmp, cset, and add.
However, `%add` is either `%z` or `%z + 1`. So, we can just use `%z` as the
source of the cset rather than wzr, saving an instruction.
This would probably be cleaner in AArch64PostLegalizerLowering, but we'd need
to change the way we represent G_ICMP to do that, I think. For now, it's
easiest to implement in selection.
This is a 0.1% code size improvement on CTMark/pairlocalalign at -Os.
Example: https://godbolt.org/z/7KdrP8
Differential Revision: https://reviews.llvm.org/D96388
This is obsoleted by the standard semanticTokens request family.
As well as the protocol details, this allows us to remove a bunch of plumbing
around pushing highlights to clients.
This should not land until the new protocol has feature parity, see D77702.
Differential Revision: https://reviews.llvm.org/D95576
If context is enabled/disabled and queried concurrently then this
results in a data race/TSAN failure with RunSafely (where boolean
variable was not locked).
There doesn't seem to be a reasonable way to enable threads that enable
and disable recovery in parallel (without also keeping
gCrashRecoveryEnabled's lock held during Fn execution which seems
undesirable). This makes enable checking if enabled thread local and
consistent with other thread local usage of crash context here.
Differential Revision: https://reviews.llvm.org/D93907
The IR/MIR pseudo probe intrinsics don't get materialized into real machine instructions and therefore they don't incur runtime cost directly. However, they come with indirect cost by blocking certain optimizations. Some of the blocking are intentional (such as blocking code merge) for better counts quality while the others are accidental. This change unblocks perf-critical optimizations that do not affect counts quality. They include:
1. IR InstCombine, sinking load operation to shorten lifetimes.
2. MIR LiveRangeShrink, similar to #1
3. MIR TwoAddressInstructionPass, i.e, opeq transform
4. MIR function argument copy elision
5. IR stack protection. (though not perf-critical but nice to have).
Reviewed By: wmi
Differential Revision: https://reviews.llvm.org/D95982
Not using builtins doesn't always imply worse code,
but for e. g. isinf, this is 30%+ faster.
Before:
name time/op
BM_isinf 2.14ns ± 2%
After:
name time/op
BM_isinf 1.33ns ± 2%
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D88854