This is an alternative to D66980, which was reverted. Instead of
inserting a pseudo instruction that optionally expands to nothing, add a
pass that inserts int3 when appropriate after basic block layout.
Reviewers: hans
Differential Revision: https://reviews.llvm.org/D67201
llvm-svn: 371466
This reverts r370525 (git commit 0bb1630685)
Also reverts r370543 (git commit 185ddc08ee)
The approach I took only works for functions marked `noreturn`. In
general, a call that is not known to be noreturn may be followed by
unreachable for other reasons. For example, there could be multiple call
sites to a function that throws sometimes, and at some call sites, it is
known to always throw, so it is followed by unreachable. We need to
insert an `int3` in these cases to pacify the Windows unwinder.
I think this probably deserves its own standalone, Win64-only fixup pass
that runs after block placement. Implementing that will take some time,
so let's revert to TrapUnreachable in the mean time.
llvm-svn: 370829
Users have complained llvm.trap produce two ud2 instructions on Win64,
one for the trap, and one for unreachable. This change fixes that.
TrapUnreachable was added and enabled for Win64 in r206684 (April 2014)
to avoid poorly understood issues with the Windows unwinder.
There seem to be two major things in play:
- the unwinder
- C++ EH, _CxxFrameHandler3 & co
The unwinder disassembles forward from the return address to scan for
epilogues. Inserting a ud2 had the effect of stopping the unwinder, and
ensuring that it ran the EH personality function for the current frame.
However, it's not clear what the unwinder does when the return address
happens to be the last address of one function and the first address of
the next function.
The Visual C++ EH personality, _CxxFrameHandler3, needs to figure out
what the current EH state number is. It does this by consulting the
ip2state table, which maps from PC to state number. This seems to go
wrong when the return address is the last PC of the function or catch
funclet.
I'm not sure precisely which system is involved here, but in order to
address these real or hypothetical problems, I believe it is enough to
insert int3 after a call site if it would otherwise be the last
instruction in a function or funclet. I was able to reproduce some
similar problems locally by arranging for a noreturn call to appear at
the end of a catch block immediately before an unrelated function, and I
confirmed that the problems go away when an extra trailing int3
instruction is added.
MSVC inserts int3 after every noreturn function call, but I believe it's
only necessary to do it if the call would be the last instruction. This
change inserts a pseudo instruction that expands to int3 if it is in the
last basic block of a function or funclet. I did what I could to run the
Microsoft compiler EH tests, and the ones I was able to run showed no
behavior difference before or after this change.
Differential Revision: https://reviews.llvm.org/D66980
llvm-svn: 370525
Variables declared with the dllimport attribute are accessed via a
stub variable named __imp_<var>. In MinGW configurations, variables that
aren't declared with a dllimport attribute might still end up imported
from another DLL with runtime pseudo relocs.
For x86_64, this avoids the risk that the target is out of range
for a 32 bit PC relative reference, in case the target DLL is loaded
further than 4 GB from the reference. It also avoids having to make the
text section writable at runtime when doing the runtime fixups, which
makes it worthwhile to do for i386 as well.
Add stub variables for all dso local data references where a definition
of the variable isn't visible within the module, since the DLL data
autoimporting might make them imported even though they are marked as
dso local within LLVM.
Don't do this for variables that actually are defined within the same
module, since we then know for sure that it actually is dso local.
Don't do this for references to functions, since there's no need for
runtime pseudo relocations for autoimporting them; if a function from
a different DLL is called without the appropriate dllimport attribute,
the call just gets routed via a thunk instead.
GCC does something similar since 4.9 (when compiling with -mcmodel=medium
or large; from that version, medium is the default code model for x86_64
mingw), but only for x86_64.
Differential Revision: https://reviews.llvm.org/D51288
llvm-svn: 340942
Tail merge was making the assumption that a layout successor or
predecessor was always a cfg successor/predecessor. Remove that
assumption. Changes to tests are necessary because the errant cfg edges
were preventing optimizations.
llvm-svn: 273700
We need the "return address" of a noreturn call to be within the
bounds of the calling function; TrapUnreachable turns 'unreachable'
into a 'ud2' instruction, which has that desired effect.
Differential Revision: http://reviews.llvm.org/D18414
llvm-svn: 264224
The Win64 unwinder disassembles forwards from each PC to try to
determine if this PC is in an epilogue. If so, it skips calling the EH
personality function for that frame. Typically, this means you cannot
catch an exception in the same frame that you threw it, because 'throw'
calls a noreturn runtime function.
Previously we avoided this problem with the TrapUnreachable
TargetOption, but that's a much bigger hammer than we need. All we need
is a 1 byte non-epilogue instruction right after the call. Instead,
what we got was an unconditional branch to a shared block containing the
ud2, potentially 7 bytes instead of 1. So, this reverts r206684, which
added TrapUnreachable, and replaces it with something better.
The new code pattern matches for invoke/call followed by unreachable and
inserts an int3 into the DAG. To be 100% watertight, we would need to
insert SEH_Epilogue instructions into all basic blocks ending in a call
with no terminators or successors, but in practice this is unlikely to
come up.
llvm-svn: 248959
Win64 stack unwinder gets confused when execution flow "falls through" after
a call to 'noreturn' function. This fixes the "missing epilogue" problem by
emitting a trap instruction for IR 'unreachable' on x86_x64-pc-windows.
A secondary use for it would be for anyone wanting to make double-sure that
'noreturn' functions, indeed, do not return.
llvm-svn: 206684
When replacing Old with New, it can happen that New is already a
successor. Add the old and new edge weights instead of creating a
duplicate edge.
llvm-svn: 161653
both fallthrough and a conditional branch target the same successor.
Gracefully delete the conditional branch and introduce any unconditional
branch needed to reach the actual successor. This fixes memory
corruption in 2009-06-15-RegScavengerAssert.ll and possibly other tests.
Also, while I'm here fix a latent bug I spotted by inspection. I never
applied the same fundamental fix to this fallthrough successor finding
logic that I did to the logic used when there are no conditional
branches. As a consequence it would have selected landing pads had they
be aligned in just the right way here. I don't have a test case as
I spotted this by inspection, and the previous time I found this
required have of TableGen's source code to produce it. =/ I hate backend
bugs. ;]
Thanks to Jim Grosbach for helping me reason through this and reviewing
the fix.
llvm-svn: 154867
This is mostly to test the waters. I'd like to get results from FNT
build bots and other bots running on non-x86 platforms.
This feature has been pretty heavily tested over the last few months by
me, and it fixes several of the execution time regressions caused by the
inlining work by preventing inlining decisions from radically impacting
block layout.
I've seen very large improvements in yacr2 and ackermann benchmarks,
along with the expected noise across all of the benchmark suite whenever
code layout changes. I've analyzed all of the regressions and fixed
them, or found them to be impossible to fix. See my email to llvmdev for
more details.
I'd like for this to be in 3.1 as it complements the inliner changes,
but if any failures are showing up or anyone has concerns, it is just
a flag flip and so can be easily turned off.
I'm switching it on tonight to try and get at least one run through
various folks' performance suites in case SPEC or something else has
serious issues with it. I'll watch bots and revert if anything shows up.
llvm-svn: 154816