Currently, ArgPromotion may leave metadata uses of promoted values,
which will end up in the wrong function, creating invalid IR.
PR33641 fixed this for dead arguments, but it can be also be triggered
arguments with users that are promoted (see the updated test case).
We also have to drop uses to them after promoting them. We need to do
this after dealing with the non-metadata uses, so I also moved the empty
use case to the loop that deals with updating the arguments of the new
function.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D85127
Extend the memop value profile buckets to be more flexible (could accommodate a
mix of individual values and ranges) and to cover more value ranges (from 11 to
22 buckets).
Disabled behind a flag (to be enabled separately) and the existing code to be
removed later.
Differential Revision: https://reviews.llvm.org/D81682
Currently, several InstrProf tests `FAIL` on Solaris (both sparc and x86):
Profile-i386 :: Posix/instrprof-visibility.cpp
Profile-i386 :: instrprof-merging.cpp
Profile-i386 :: instrprof-set-file-object-merging.c
Profile-i386 :: instrprof-set-file-object.c
On sparc there's also
Profile-sparc :: coverage_comments.cpp
The failure mode is always the same:
error: /var/llvm/local-amd64/projects/compiler-rt/test/profile/Profile-i386/Posix/Output/instrprof-visibility.cpp.tmp: Failed to load coverage: Malformed coverage data
The error is from `llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp`
(`loadBinaryFormat`), l.926:
InstrProfSymtab ProfileNames;
std::vector<SectionRef> NamesSectionRefs = *NamesSection;
if (NamesSectionRefs.size() != 1)
return make_error<CoverageMapError>(coveragemap_error::malformed);
where .size() is 2 instead.
Looking at the executable, I find (with `elfdump -c -N __llvm_prf_names`):
Section Header[15]: sh_name: __llvm_prf_names
sh_addr: 0x8053ca5 sh_flags: [ SHF_ALLOC ]
sh_size: 0x86 sh_type: [ SHT_PROGBITS ]
sh_offset: 0x3ca5 sh_entsize: 0
sh_link: 0 sh_info: 0
sh_addralign: 0x1
Section Header[31]: sh_name: __llvm_prf_names
sh_addr: 0x8069998 sh_flags: [ SHF_WRITE SHF_ALLOC ]
sh_size: 0 sh_type: [ SHT_PROGBITS ]
sh_offset: 0x9998 sh_entsize: 0
sh_link: 0 sh_info: 0
sh_addralign: 0x1
Unlike GNU `ld` (which primarily operates on section names) the Solaris
linker, following the ELF spirit, only merges input sections into an output
section if both section name and section flags match, so two separate
sections are maintained.
The read-write one comes from `lib/clang/12.0.0/lib/sunos/libclang_rt.profile-i386.a(InstrProfilingPlatformLinux.c.o)`
while the read-only one is generated by
`llvm/lib/Transforms/Instrumentation/InstrProfiling.cpp` (`InstrProfiling::emitNameData`)
at l.1004 where `isConstant = true`.
The easiest way to avoid the mismatch is to change the definition in
`compiler-rt/lib/profile/InstrProfilingPlatformLinux.c` to `const`.
This fixes all failures observed.
Tested on `amd64-pc-solaris2.11`, `sparcv9-sun-solaris2.11`, and
`x86_64-pc-linux-gnu`.
Differential Revision: https://reviews.llvm.org/D85116
Instructions should not be scheduled across ENDBR instructions, as this would result in the ENDBR being displaced, breaking the parity needed for the Indirect Branch Tracking feature of CET.
Currently, the X86IndirectBranchTracking pass is later than the instruction scheduling in the pipeline, what causes the bug to be unnoticeable and very hard (if not unfeasible) to be triggered while compiling C files with the standard LLVM setup. Yet, for correctness and to prevent issues in future changes, the compiler should prevent the such scheduling.
Differential Revision: https://reviews.llvm.org/D84862
This allows us to remove the (depth violating) code in getFauxShuffleMask where we were combining the OR(SHUFFLE,SHUFFLE) shuffle inputs as well, and not just the OR().
This is a minor step toward being able to shuffle combine from/to SELECT/BLENDV as a faux shuffle.
The strictfp attribute is required on all function calls in a function
that is itself marked with the strictfp attribute. The IRBuilder knows
this and has a method for adding the attribute to function call instructions.
If a function being called has the strictfp attribute itself then the
IRBuilder will refuse to add the attribute to the calling instruction
despite being asked to add it. Eliminate this error.
Differential Revision: https://reviews.llvm.org/D84878
The root cause was fixed by 3d6f53018f.
The workaround added in 99ad956fda can be changed
to an assert now. (In case the fix regresses, there will be a heap-use-after-free.)
This adds an isel pattern and special XOR8rr_NOREX instruction
to enable the use of h-registers for __builtin_parity. This avoids
a copy and a shift instruction. The NOREX instruction is in case
register allocation doesn't use the matching l-register for some
reason. If a R8-R15 register gets picked instead, we won't be
able to encode the instruction since an h-register can't be used
with a REX prefix.
Fixes PR46954
Remove use of iterator::difference_type to know where to insert a
moved or erased block during undo actions.
Differential Revision: https://reviews.llvm.org/D85066
This revision adds a transformation and a pattern that rewrites a "maybe masked" `vector.transfer_read %view[...], %pad `into a pattern resembling:
```
%1:3 = scf.if (%inBounds) {
scf.yield %view : memref<A...>, index, index
} else {
%2 = vector.transfer_read %view[...], %pad : memref<A...>, vector<...>
%3 = vector.type_cast %extra_alloc : memref<...> to
memref<vector<...>> store %2, %3[] : memref<vector<...>> %4 =
memref_cast %extra_alloc: memref<B...> to memref<A...> scf.yield %4 :
memref<A...>, index, index
}
%res= vector.transfer_read %1#0[%1#1, %1#2] {masked = [false ... false]}
```
where `extra_alloc` is a top of the function alloca'ed buffer of one vector.
This rewrite makes it possible to realize the "always full tile" abstraction where vector.transfer_read operations are guaranteed to read from a padded full buffer.
The extra work only occurs on the boundary tiles.
Differential Revision: https://reviews.llvm.org/D84631
A JSON->TensorSpec utility we will use subsequently to specify
additional outputs needed for certain training scenarios.
Differential Revision: https://reviews.llvm.org/D84976
Freeze always returns a defined value. This also prevents msan from
checking the input shadow, which happened because freeze wasn't
explicitly visited.
Differential Revision: https://reviews.llvm.org/D85040
In some cases, it seems like we can get rid of unnecessary s/umins by
using information from the loop guards (unless I am missing something).
One place where this seems to be helpful in practice is when computing
loop trip counts. This patch just changes howManyGreaterThans for now.
Note that this requires a loop for which we can check 'is guarded'.
On SPEC2000/SPEC2006/MultiSource, there are some notable changes for
some programs in the number of loops unrolled and trip counts computed.
```
Same hash: 179 (filtered out)
Remaining: 58
Metric: scalar-evolution.NumTripCountsComputed
Program base patch diff
test-suite...langs-C/compiler/compiler.test 25.00 31.00 24.0%
test-suite.../Applications/SPASS/SPASS.test 2020.00 2323.00 15.0%
test-suite...langs-C/allroots/allroots.test 29.00 32.00 10.3%
test-suite.../Prolangs-C/loader/loader.test 17.00 18.00 5.9%
test-suite...fice-ispell/office-ispell.test 253.00 265.00 4.7%
test-suite...006/450.soplex/450.soplex.test 3552.00 3692.00 3.9%
test-suite...chmarks/MallocBench/gs/gs.test 453.00 470.00 3.8%
test-suite...ngs-C/assembler/assembler.test 29.00 30.00 3.4%
test-suite.../Benchmarks/Ptrdist/bc/bc.test 263.00 270.00 2.7%
test-suite...rks/FreeBench/pifft/pifft.test 722.00 741.00 2.6%
test-suite...count/automotive-bitcount.test 41.00 42.00 2.4%
test-suite...0/253.perlbmk/253.perlbmk.test 1417.00 1451.00 2.4%
test-suite...000/197.parser/197.parser.test 387.00 396.00 2.3%
test-suite...lications/sqlite3/sqlite3.test 1168.00 1189.00 1.8%
test-suite...000/255.vortex/255.vortex.test 173.00 176.00 1.7%
Metric: loop-unroll.NumUnrolled
Program base patch diff
test-suite...langs-C/compiler/compiler.test 1.00 3.00 200.0%
test-suite.../Applications/SPASS/SPASS.test 134.00 234.00 74.6%
test-suite...count/automotive-bitcount.test 3.00 4.00 33.3%
test-suite.../Prolangs-C/loader/loader.test 3.00 4.00 33.3%
test-suite...langs-C/allroots/allroots.test 3.00 4.00 33.3%
test-suite...Source/Benchmarks/sim/sim.test 10.00 12.00 20.0%
test-suite...fice-ispell/office-ispell.test 21.00 25.00 19.0%
test-suite.../Benchmarks/Ptrdist/bc/bc.test 32.00 38.00 18.8%
test-suite...006/450.soplex/450.soplex.test 300.00 352.00 17.3%
test-suite...rks/FreeBench/pifft/pifft.test 60.00 69.00 15.0%
test-suite...chmarks/MallocBench/gs/gs.test 57.00 63.00 10.5%
test-suite...ngs-C/assembler/assembler.test 10.00 11.00 10.0%
test-suite...0/253.perlbmk/253.perlbmk.test 145.00 157.00 8.3%
test-suite...000/197.parser/197.parser.test 43.00 46.00 7.0%
test-suite...TimberWolfMC/timberwolfmc.test 205.00 214.00 4.4%
Geomean difference 7.6%
```
Fixes https://bugs.llvm.org/show_bug.cgi?id=46939
Fixes https://bugs.llvm.org/show_bug.cgi?id=46924 on X86.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D85046
This reverts commit 35b65be041.
Build is broken with -DBUILD_SHARED_LIBS=ON with some undefined
references like:
VectorTransforms.cpp:(.text._ZN4llvm12function_refIFvllEE11callback_fnIZL24createScopedInBoundsCondN4mlir25VectorTransferOpInterfaceEE3$_8EEvlll+0xa5): undefined reference to `mlir::edsc::op::operator+(mlir::Value, mlir::Value)'
This patch stops unconditionally transforming FSUB(-0,X) into an FNEG(X) while building the DAG. There is also one small change to handle the new FSUB(-0,X) similarly to FNEG(X) in the AMDGPU backend.
Differential Revision: https://reviews.llvm.org/D84056
Summary:
Not all projects in the project map file might have newer results
for updating, we should handle this situation gracefully.
Additionally, not every user of the test system would want storing
reference results in git. For this reason, git functionality is now
optional.
Differential Revision: https://reviews.llvm.org/D84303
`DenseMapAPIntKeyInfo` is now located in `lib/IR/LLVMContextImpl.h`.
Moved it into `include/ADT/DenseMapInfo.h` to use it.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D85131
The 1st try at this (rG2265d01f2a5b) exposed what looks like
unspecified behavior in C/C++ resulting in test variations.
The arguments to BinaryOperator::CreateAnd() were both IRBuilder
function calls, and the order in which they execute determines
the order of the new instructions in the IR. But the order of
function arg evaluation is not fixed by the rules of C/C++, so
depending on compiler config, the test would fail because the
test expected a single fixed ordering of instructions.
Original commit message:
I tried to use m_Deferred() on this, but didn't find
a clean way to do that.
http://bugs.llvm.org/PR46955https://alive2.llvm.org/ce/z/2h6QTq
The offsets field should be omitted when the 'OffsetEntryCount' entry is
specified to be 0.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D85006
The current modeling of LLVM IR types in MLIR is based on the LLVMType class
that wraps a raw `llvm::Type *` and delegates uniquing, printing and parsing to
LLVM itself. This model makes thread-safe type manipulation hard and is being
progressively replaced with a cleaner MLIR model that replicates the type
system. Introduce a set of classes reflecting the LLVM IR type system in MLIR
instead of wrapping the existing types. These are currently introduced as
separate classes without affecting the dialect flow, and are exercised through
a test dialect. Once feature parity is reached, the old implementation will be
gradually substituted with the new one.
Depends On D84171
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D84339
There were various hacks used to try to avoid making s1 SGPR vs. s1
VCC ambiguous after constraining the register before we had a strategy
to deal with this. This also attempted to handle undef operands, which
are now illegal gMIR.
Use pad with undef and unmerge with unused results. This is annoyingly
similar to several other places in LegalizerHelper, but they're all
slightly different.
We already do this on AVX (+ for ZERO_EXTEND_VECTOR_INREG), but this enables it for all SSE targets - we attempted something similar back at rL357057 but hit issues with the ZERO_EXTEND_VECTOR_INREG handling (PR41249).
I'm still looking at the vector-mul.ll regression - which is due to 32-bit targets performing the load as a f64, resulting in the shuffle combiner thinking it has to create a shuffle in the float domain.
`clang/test/CodeGenCXX/fp16-mangle.cpp` tests pointers to __fp16, but
if you give the `-fallow-half-arguments-and-returns` option, then
clang can also leave an __fp16 unmodified as a function argument or
return type. This regression test checks the name-mangling of that.
Reviewed By: miyuki
Differential Revision: https://reviews.llvm.org/D85010
Fixes a regression caused by D82439, in which IT blocks were no longer being
generated when -Oz is present. This was due to the CPSR register being marked as
dead, while this case was not accounted for.
Differential Revision: https://reviews.llvm.org/D83667
Summary: `BaseMask` occupies the lowest bits. Effect of applying the mask is neutralized by right shift operation, thus making it useless.
Fix: Remove a redundant bitwise operation.
Differential Revision: https://reviews.llvm.org/D85026
Summary: Simplify functions SVal::getAsSymbolicExpression SVal::getAsSymExpr and SVal::getAsSymbol. After revision I concluded that `getAsSymbolicExpression` and `getAsSymExpr` repeat functionality of `getAsSymbol`, thus them can be removed.
Fix: Remove functions SVal::getAsSymbolicExpression and SVal::getAsSymExpr.
Differential Revision: https://reviews.llvm.org/D85034
This skips searching for `target` related flags in the existing args if
we don't have a valid target to insert.
Depends on D85076
Differential Revision: https://reviews.llvm.org/D85077
This patch does the following:
1) Starts using YAML macro to reduce the number of YAML documents in tests.
2) Adds `#` before 'RUN'/`CHECK` lines in a few tests where it is missing.
3) Removes unused YAML keys.
4) Starts using `ENTSIZE=<none>` to simplify tests (see D84526).
5) Removes trailing white spaces in a few places.
Differential revision: https://reviews.llvm.org/D85013
Not passing --clang would result in a python exception after this change:
(TypeError: expected str, bytes or os.PathLike object, not NoneType)
because the --clang argument default was only being populated in the
initial argument parsing pass but not later on.
Fix this by adding an argparse callback to set the default values.
Reviewed By: vitalybuka, MaskRay
Differential Revision: https://reviews.llvm.org/D84511
The lldb test-suite on Windows reports a 'CLEANUP ERROR' when attempting to kill
an exited/detached process. This change makes ProcessWindows consistent with
the other processes which only log the error. After this change a number of
'CLEANUP ERROR' messages are now removed.
Differential Revision: https://reviews.llvm.org/D84957
The check-* targets run ${Python3_EXECUTABLE} $BUILD/bin/llvm-lit, but
running `./bin/llvm-lit $ARGS` from the build directory currently always
uses "python" to run llvm-lit. On most systems this will be python2.7 even
if we found python3 at CMake time.
Reviewed By: compnerd
Differential Revision: https://reviews.llvm.org/D84625