For some transformations like hot-cold split or coro split, it can outline its part of function ranges. Since sample loader is the early stage of backend and no split happens at that time, compiler can't recognize those function, so in llvm-profgen we should attribute the sample to the original function. This is already done for the body range samples since we use the symbols from dwarf which is created before the split.
But for branch samples, the call from master function to its outlined function is actually not a call to the original function, we shouldn't add head/callsie samples for it. So instead of dwarf symbol, we use the symbols from symbol table and ignore those functions with special suffixes(like `.cold` ,`.resume`) for accumulating the callsite/head samples.
Reviewed By: hoy, wenlei
Differential Revision: https://reviews.llvm.org/D110864
`ScudoWrappersCppTest.AllocAfterFork` was failing obscurely sometimes.
Someone pointed us to Linux's `vm.max_map_count` that can be
significantly lower on some machines than others. It turned out that
on a machine with that setting set to 65530, some `ENOMEM` errors
would occur with `mmap` & `mprotect` during that specific test.
Reducing the number of times we fork, and the maximum size allocated
during that test makes it pass on those machines.
Differential Revision: https://reviews.llvm.org/D111342
This change fixes a bug where the compiler generates a prologue_end
for line 0 locs. That is because line 0 is not associated with any
source location, so there should not be a prolgoue_end at a location
that doesn't correspond to a source location.
There were some LLVM tests that were explicitly checking for line 0
prologue_end's as well since I believe that to be incorrect, I had to
change those tests as well.
Patch by Shubham Rastogi!
Differential Revision: https://reviews.llvm.org/D110740
This will simplify removing id proposed by @dvyukov on D111183
Also now we have more flexiliby for traces compressio they
are not interleaving with uncompressable headers.
Depends on D111256.
Differential Revision: https://reviews.llvm.org/D111274
Bit positions for the intrinsics IBCLR and IBSET and shift counts
for the intrinsics ISHFT/SHIFTA/SHIFTL/SHIFTR should be validated
when folding.
Differential Revision: https://reviews.llvm.org/D111327
Updating the MachineDominatorTree is easy since SILowerControlFlow only
splits and removes basic blocks. This should save a bit of compile time
because previously we would recompute the dominator tree from scratch
after this pass.
Another reason for doing this is that SILowerControlFlow preserves
LiveIntervals which transitively requires MachineDominatorTree. I think
that means that SILowerControlFlow is obliged to preserve
MachineDominatorTree too as explained here:
https://lists.llvm.org/pipermail/llvm-dev/2020-November/146923.html
although it does not seem to have caused any problems in practice yet.
Differential Revision: https://reviews.llvm.org/D111313
In a couple of places machine verification was disabled for no apparent
reason, probably just because an "addPass(..., false)" line was cut and
pasted from elsewhere.
After this patch the only remaining place where machine verification is
disabled in the generic TargetPassConfig code, is after addPreEmitPass.
DecompGEP.Base and UnderlyingV are currently always the same.
However, logically DecompGEP.Base is the right value to use here,
because the decomposed offset is relative to that base.
Some subprojects like compiler-rt define the `darwin` feature in their
lit config, but clang does not do that, so we need to use the global
`system-darwin` here instead.
Differential Revision: https://reviews.llvm.org/D111267
LoopFlatten does preserve loop analyses (DT, LI and SCEV), but
currently doesn't mark them as preserved in the NewPM (they are
marked as preserved in the LegacyPM). I think this doesn't really
have an effect in the end because the loop pass adaptor will just
assume they're preserved anyway, but let's be explicit about this
for the sake of clarity.
Differential Revision: https://reviews.llvm.org/D111328
Vendors take libc++ and ship it in various ways. Some vendors might
ship it differently from what upstream LLVM does, i.e. the install
location might be different, some ABI properties might differ, etc.
In the past few years, I've come across several instances where
having a place to test some of these properties would have been
incredibly useful. I also just got bitten by the lack of tests
of that kind, so I'm adding some now.
The tests added by this commit for Apple platforms have numerous
TODOs that capture discrepancies between the upstream LLVM CMake
and the slightly-modified build we perform internally to produce
Apple's system libc++. In the future, the goal would be to upstream
all those differences so that it's possible to build a faithful
Apple system libc++ with the upstream LLVM sources only.
But this isn't only useful for Apple - this lays out the path for
any vendor being able to add their own checks (either upstream or
downstream) to libc++.
This is a re-application of 9892d1644f, which was reverted in 138dc27186
because it broke the build. The issue was that we didn't apply the required
changes to libunwind and our CI didn't notice it because we were not
running the libunwind tests. This has been fixed now, and we're running
the libunwind tests in CI now too.
Differential Revision: https://reviews.llvm.org/D110736
Some subprojects like compiler-rt define the `darwin` feature in their
lit config, but lld does not do that, so we need to use the global
system-darwin here instead. This test seems to have drifted from the
actual behavior so I also had to add `/usr/local/lib` here to make it
pass.
Differential Revision: https://reviews.llvm.org/D111268
When PHIElimination adds kills after lowering PHIs to COPYs it knows
that some instructions after the inserted COPY might use the same
SrcReg, but it was only looking at the terminator instructions at the
end of the block, not at other instructions like INLINEASM_BR that can
appear after the COPY insertion point.
Since we have already called findPHICopyInsertPoint, which knows about
INLINEASM_BR, we might as well reuse the insertion point that it
calculated when looking for instructions that might use SrcReg.
This fixes a machine verification failure if you force machine
verification to run after PHIElimination (currently it is disabled for
other reasons) when running
test/CodeGen/X86/callbr-asm-phi-placement.ll.
Differential Revision: https://reviews.llvm.org/D110834
Summary:
The IR is saved in its print form before each pass is started and a
signal handler is registered. If the compilation crashes, the signal
handler will print the saved IR to dbgs(). This option
can be modified using -print-module-scope to get the IR for the complete
module. Filtering options can be used to improve performance by limiting
which passes (or functions) save the IR. Note that this option only works
with the new pass manager.
Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: aeubanks (Arthur Eubanks) yrouban (Yevgeny Rouban)
Differential Revision: https://reviews.llvm.org/D86657
Add an interface for outlineable OpenMP operations.
This patch was initially done in fir-dev and is now needed
for the upstreaming.
Reviewed By: schweitz
Differential Revision: https://reviews.llvm.org/D111310
* Need to investigate the proper solution to https://github.com/pybind/pybind11/issues/3336 or engineer something different.
* The attempt to produce an empty buffer_info as a workaround triggers asan/ubsan.
* Usage of this API does not arise naturally in practice yet, and it is more important to be asan/crash clean than have a solution right now.
* Switching back to raising an exception, even though that triggers terminate().
X86InstrInfo::convertToThreeAddress would convert this:
%1:gr32 = ADD32rr killed %0:gr32(tied-def 0), %0:gr32, implicit-def dead $eflags
to this:
undef %2.sub_32bit:gr64 = COPY killed %0:gr32
undef %3.sub_32bit:gr64_nosp = COPY %0:gr32
%1:gr32 = LEA64_32r killed %2:gr64, 1, killed %3:gr64_nosp, 0, $noreg
Note that in the ADD32rr, %0 was used twice and the first use had a kill
flag, which is what MachineInstr::addRegisterKilled does.
In the converted code, each use of %0 is copied to a new reg, and the
first COPY inherits the kill flag from the ADD32rr. This causes
machine verification to fail (if you force it to run after
TwoAddressInstructionPass) because the second COPY uses %0 after it is
killed. Note that machine verification is currently disabled after
TwoAddressInstructionPass but this is a step towards being able to
enable it.
Fix this by not inserting more than one COPY from the same source
register.
Differential Revision: https://reviews.llvm.org/D110829
In the review of D111085 it was pointed out that these functions don't
conform to the naming scheme in use in LLVM. With this commit we should
be good for all of FPEnv.h.
Print this basic block flag as inlineasm-br-indirect-target and parse
it. This allows you to write MIR test cases for INLINEASM_BR. The test
case I added is one that I wanted to precommit anyway for D110834.
Differential Revision: https://reviews.llvm.org/D111291
This patch fixes problems reported in PR51981.
When rotating a loop it isn't enough to just forget SCEV for that
loop nest. When rotating we might clone some instructions from the
old header into the preheader, and insert new PHI nodes to merge
values together. There could be users of the original value that are
updated to use the PHI result. And those users were not necessarily
depending on a PHI node earlier, so they weren't cleaned up when just
forgetting all SCEV:s for the loop nest. So we need to explicitly
forget those values to avoid invalid cached SCEV expressions.
Reviewed By: fhahn, mkazantsev
Differential Revision: https://reviews.llvm.org/D110813
mingw-g++ does not correctly support the full `std::errc` namespace as
worded in the standard[1]. As such, we cannot reliably use all names
therein. This patch changes the use of
`std::errc::state_not_recoverable`, to use portable error codes from the
`llvm::errc` equivalent.
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71444
Reviewed by v.g.vassilev
Differential Revision: https://reviews.llvm.org/D111315
This patch fixes:
llvm-project/lldb/source/Plugins/ABI/PowerPC/ABISysV_ppc.cpp:204:6:
error: missing field 'invalidate_regs' initializer
[-Werror,-Wmissing-field-initializers]
Desccribe in cxx_status.html the missing parts of the partially
implemented proposals described in cxx_status.html.
Uses <details> blocks so the information appears collapsed
by default.
This patch updates the vec_popcnt builtins to return vector unsigned,
as defined by the Power Vector Intrinsics Programming Reference.
This patch is NFC and all existing tests pass.
Differential Revision: https://reviews.llvm.org/D110934
This change is to keep the help text and command guide of llvm-readelf
in tandem.
- In the help text mention that --section-data, --section-relocations,
--section-symbols and --stack-sizes have no effect on GNU style
output; give the accepted values for --elf-output-style and update
the description of --gnu-hash-table to use the command guide
description.
- In the command guide add the missing options -a,
--dependant-libraries,--no-demangle, --wide and -W. Also update the
description of --symbols so it matches the help text.
Differential Revision: https://reviews.llvm.org/D111240
Replace `&__rhs` with `_VSTD::addressof(__rhs)` to guard against ADL hijacking
of `operator&` in `operator=`. Thanks to @CaseyCarter for bringing it to our
attention.
Similar issues with hijacking `operator&` still exist, they will be
addressed separately.
Reviewed By: #libc, Quuxplusone, ldionne
Differential Revision: https://reviews.llvm.org/D110852