MSVC needs to know where to put the archive (.lib) as well as the runtime
(.dll). If left to the default location, multiple rules to generate the same
file will be produced, creating a Ninja error.
Differential Revision: https://reviews.llvm.org/D108181
Address post follow up comment in D108016. Avoid creating isec for
LLVM segments since we are skipping over it.
Reviewed By: #lld-macho, int3
Differential Revision: https://reviews.llvm.org/D108167
Suffix opcodes with _gfx10.
Remove direct references to architecture specific opcodes.
Add a BVH flag and apply this to diassembly.
Fix a number of disassembly errors on gfx90a target caused by
previous incorrect BVH detection code.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D108117
* Rename ids to values in FlatAffineValueConstraints.
* Overall cleanup of comments in FlatAffineConstraints and FlatAffineValueConstraints.
Differential Revision: https://reviews.llvm.org/D107947
* Extract "value" functionality of `FlatAffineConstraints` into a new derived `FlatAffineValueConstraints` class. Current users of `FlatAffineConstraints` can use `FlatAffineValueConstraints` without additional code changes, thus NFC.
* `FlatAffineConstraints` no longer associates dimensions with SSA Values. All functionality that requires this, is moved to `FlatAffineValueConstraints`.
* `FlatAffineConstraints` no longer makes assumptions about where Values associated with dimensions are coming from.
Differential Revision: https://reviews.llvm.org/D107725
Sample profiles are stored in a string map which is basically an unordered map. Printing out profiles by simply walking the string map doesn't enforce an order. I'm sorting the map in the decreasing order of total samples to enable a more stable dump, which is good for comparing two dumps.
Reviewed By: wenlei, wlei
Differential Revision: https://reviews.llvm.org/D108147
The Linux kernel has a macro called IS_ENABLED(), which evaluates to a
constant 1 or 0 based on Kconfig selections, allowing C code to be
unconditionally enabled or disabled at build time. For example:
int foo(struct *a, int b) {
switch (b) {
case 1:
if (a->flag || !IS_ENABLED(CONFIG_64BIT))
return 1;
__attribute__((fallthrough));
case 2:
return 2;
default:
return 3;
}
}
There is an unreachable warning about the fallthrough annotation in the
first case because !IS_ENABLED(CONFIG_64BIT) can be evaluated to 1,
which looks like
return 1;
__attribute__((fallthrough));
to clang.
This type of warning is pointless for the Linux kernel because it does
this trick all over the place due to the sheer number of configuration
options that it has.
Add -Wunreachable-code-fallthrough, enabled under -Wunreachable-code, so
that projects that want to warn on unreachable code get this warning but
projects that do not care about unreachable code can still use
-Wimplicit-fallthrough without having to make changes to their code
base.
Fixes PR51094.
Reviewed By: aaron.ballman, nickdesaulniers
Differential Revision: https://reviews.llvm.org/D107933
This method bitcasts a DenseElementsAttr elementwise to one of the same
shape with a different element type.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D107612
This removes the -Werror compilation flag for x64 linux to work around a gcc bug.
GCC 8.3 reports '__tsan::v3::Event::type’ is too small to hold all values of ‘enum class __tsan::v3::EventType’
incorrectly which gets promoted to an error and causes the build to fail.
There is a on-dmeand creation of function profile during top-down processing in the sample loader when merging uninlined callees. During the profile creation, a stack string object is used to store a newly-created MD5 name, which is then used by reference as hash key in the profile map. This makes the hash key a dangling reference when later on the stack string object is deallocated.
The issue only happens with md5 profile use and was exposed by context split work for CS profile. I'm making a fix by storing newly created names in the reader.
Reviewed By: wenlei, wmi, wlei
Differential Revision: https://reviews.llvm.org/D108142
@arichardson pointed out in post-commit review for
https://reviews.llvm.org/D95583 (b714f73def) that `-verify` has an
optional argument that works a lot like `FileCheck`'s `-check-prefix`.
Use it to simplify the test for `-fno-implicit-modules-use-lock`!
Previously we're passing `llvm::Function&` into `M68kCCState` to lower
arguments in fastcc. However, that reference might not be available if
it's a library call and we only need its argument types. Therefore,
now we're simply passing a list of argument llvm::Type-s.
This fixes PR-50752.
Differential Revision: https://reviews.llvm.org/D108101
Basic block pointer is dereferenced unconditionally for MBBs with
hasAddressTaken property.
MBBs might have hasAddressTaken property without reference to BB.
Backend developers must assign fake BB to MBB to workaround this issue
and it should be fixed.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D108092
Since the precondition for loop is `size >= T::kSize` we always expect
at least one run of the loop. This patch transforms the for-loop into a
do/while-loop which saves at least one test.
We also add a second template parameter to allow the Tail operation to
differ from the loop operation.
Similar to the MQPR register class as the MVE equivalent to QPR, this
adds MQQPR and MQQQQPR register classes for the MVE equivalents of QQPR
and QQQQPR registers. The MVE MQPR seemed have worked out quite well,
and adding MQQPR and MQQQQPR allows us to a little more accurately
specify the number of registers, calculating register pressure limits a
little better.
Differential Revision: https://reviews.llvm.org/D107463
This is a redo of D108089 that broke some 32-bit builds.
`scudo::uptr` was defined as an `unsigned long` on 32-b platform,
while a `uintptr_t` is usually defined as an `unsigned int`.
This worked, this was not consistent, particularly with regard to
format string specifiers.
As suggested by Vitaly, since we are including `stdint.h`, define
the internal scudo integer types to those.
Differential Revision: https://reviews.llvm.org/D108152
Reduction axis should come after all parallel axis to work with vectorization.
Reviewed By: NatashaKnk
Differential Revision: https://reviews.llvm.org/D108005
In 9ea6dd5cfa /
https://reviews.llvm.org/D88387 where I added skinny corefile
creation, I added new SB API and tried to manually update the hooks
for the reproducers. I missed a spot, and I should have used
lldb-instr to update the instrumentation automatically.
The patch adds ELF notes into SHT_NOTE sections of ELF offload images
passed to clang-offload-wrapper.
The new notes use a null-terminated "LLVMOMPOFFLOAD" note name.
There are currently three types of notes:
VERSION: a string (not null-terminated) representing the ELF offload
image structure. The current version '1.0' does not put any restrictions
on the structure of the image. If we ever need to come up with a common
structure for ELF offload images (e.g. to be able to analyze the images
in libomptarget in some standard way), then we will introduce new versions.
PRODUCER: a vendor specific name of the producing toolchain.
Upstream LLVM uses "LLVM" (not null-terminated).
PRODUCER_VERSION: a vendor specific version of the producing toolchain.
Upstream LLVM uses LLVM_VERSION_STRING with optional <space> LLVM_REVISION.
All three notes are not mandatory currently.
Differential Revision: https://reviews.llvm.org/D99551
Currently isReallyTriviallyReMaterializableGeneric() implementation
prevents rematerialization on any virtual register use on the grounds
that is not a trivial rematerialization and that we do not want to
extend liveranges.
It appears that LRE logic does not attempt to extend a liverange of
a source register for rematerialization so that is not an issue.
That is checked in the LiveRangeEdit::allUsesAvailableAt().
The only non-trivial aspect of it is accounting for tied-defs which
normally represent a read-modify-write operation and not rematerializable.
The test for a tied-def situation already exists in the
/CodeGen/AMDGPU/remat-vop.mir,
test_no_remat_v_cvt_f32_i32_sdwa_dst_unused_preserve.
The change has affected ARM/Thumb, Mips, RISCV, and x86. For the targets
where I more or less understand the asm it seems to reduce spilling
(as expected) or be neutral. However, it needs a review by all targets'
specialists.
Differential Revision: https://reviews.llvm.org/D106408
There was an instance of a third-party archive containing multiple
_llvm symbols from different files that clashed with each other
producing duplicate symbols. Symbols under the LLVM segment
don't seem to be producing any meaningful value, so just ignore them.
Reviewed By: #lld-macho, int3
Differential Revision: https://reviews.llvm.org/D108016
This patch adds static keyword to internal functions that write
binary id to restrict visibility to the file that they are declared.
Differential Revision: https://reviews.llvm.org/D108154
Check if a remateralizable nstruction does not have any virtual
register uses. Even though rematerializable RA might not actually
rematerialize it in this scenario. In that case we do not want to
hoist such instruction out of the loop in a believe RA will sink
it back if needed.
This already has impact on AMDGPU target which does not check for
this condition in its isTriviallyReMaterializable implementation
and have instructions with virtual register uses enabled. The
other targets are not impacted at this point although will be when
D106408 lands.
Differential Revision: https://reviews.llvm.org/D107677
This option has been enabled by default for quite a while now.
The practical impact of removing the option is that MSSA use
cannot be disabled in default pipelines (both LPM and NPM) and
in manual LPM invocations. NPM can still choose to enable/disable
MSSA using loop vs loop-mssa.
The next step will be to require MSSA for LICM and drop the
AST-based implementation entirely.
Differential Revision: https://reviews.llvm.org/D108075
These operations are not lowered to from any source dialect and are only
used for redundant tests. Removing these named ops, along with their
associated tests, will make migration to YAML operations much more
convenient.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D107993
LoopLoadElimination, LoopVersioning and LoopVectorize currently
fetch MemorySSA when construction LoopAccessAnalysis. However,
LoopAccessAnalysis does not actually use MemorySSA and we can pass
nullptr instead.
This saves one MemorySSA calculation in the default pipeline, and
thus improves compile-time.
Differential Revision: https://reviews.llvm.org/D108074
Two standalone LoopRotate passes scheduled using
createFunctionToLoopPassAdaptor() currently enable MemorySSA.
However, while LoopRotate can preserve MemorySSA, it does not use
it, so requiring MemorySSA is unnecessary.
This change doesn't have a practical compile-time impact by itself,
because subsequent passes still request MemorySSA.
Differential Revision: https://reviews.llvm.org/D108073
`scudo::uptr` was defined as an `unsigned long` on 32-b platform,
while a `uintptr_t` is usually defined as an `unsigned int`.
This worked, this was not consistent, particularly with regard to
format string specifiers.
As suggested by Vitaly, since we are including `stdint.h`, define
the internal `scudo` integer types to those.
Differential Revision: https://reviews.llvm.org/D108089