Implement 'retn' simply by aliasing it to the relevant 'ret' instruction
Commit on behalf of coby
Differential Revision: https://reviews.llvm.org/D24346
llvm-svn: 282601
Simplify Consecutive Merge Store Candidate Search
Now that address aliasing is much less conservative, push through
simplified store merging search which only checks for parallel stores
through the chain subgraph. This is cleaner as the separation of
non-interfering loads/stores from the store-merging logic.
Whem merging stores, search up the chain through a single load, and
finds all possible stores by looking down from through a load and a
TokenFactor to all stores visited. This improves the quality of the
output SelectionDAG and generally the output CodeGen (with some
exceptions).
Additional Minor Changes:
1. Finishes removing unused AliasLoad code
2. Unifies the the chain aggregation in the merged stores across
code paths
3. Re-add the Store node to the worklist after calling
SimplifyDemandedBits.
4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
arbitrary, but seemed sufficient to not cause regressions in
tests.
This finishes the change Matt Arsenault started in r246307 and
jyknight's original patch.
Many tests required some changes as memory operations are now
reorderable. Some tests relying on the order were changed to use
volatile memory operations
Noteworthy tests:
CodeGen/AArch64/argument-blocks.ll -
It's not entirely clear what the test_varargs_stackalign test is
supposed to be asserting, but the new code looks right.
CodeGen/AArch64/arm64-memset-inline.lli -
CodeGen/AArch64/arm64-stur.ll -
CodeGen/ARM/memset-inline.ll -
The backend now generates *worse* code due to store merging
succeeding, as we do do a 16-byte constant-zero store efficiently.
CodeGen/AArch64/merge-store.ll -
Improved, but there still seems to be an extraneous vector insert
from an element to itself?
CodeGen/PowerPC/ppc64-align-long-double.ll -
Worse code emitted in this case, due to the improved store->load
forwarding.
CodeGen/X86/dag-merge-fast-accesses.ll -
CodeGen/X86/MergeConsecutiveStores.ll -
CodeGen/X86/stores-merging.ll -
CodeGen/Mips/load-store-left-right.ll -
Restored correct merging of non-aligned stores
CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll -
Improved. Correctly merges buffer_store_dword calls
CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll -
Improved. Sidesteps loading a stored value and merges two stores
CodeGen/X86/pr18023.ll -
This test has been removed, as it was asserting incorrect
behavior. Non-volatile stores *CAN* be moved past volatile loads,
and now are.
CodeGen/X86/vector-idiv.ll -
CodeGen/X86/vector-lzcnt-128.ll -
It's basically impossible to tell what these tests are actually
testing. But, looks like the code got better due to the memory
operations being recognized as non-aliasing.
CodeGen/X86/win32-eh.ll -
Both loads of the securitycookie are now merged.
CodeGen/AMDGPU/vgpr-spill-emergency-stack-slot-compute.ll -
This test appears to work but no longer exhibits the spill
behavior.
Reviewers: arsenm, hfinkel, tstellarAMD, nhaehnle, jyknight
Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, resistor, tstellarAMD, t.p.northover, spatel
Differential Revision: https://reviews.llvm.org/D14834
llvm-svn: 282600
Summary:
This adds the AVRMCTargetDesc file in tree. It allows creation of the
core classes used in the backend.
Reviewers: arsenm, kparzysz
Subscribers: wdng, beanz, mgorny
Differential Revision: https://reviews.llvm.org/D25023
llvm-svn: 282597
This subfolder just like "linkerscript" subfolder keeps
testcases with invalid input. According to PR30540 it seems
we might have many new ones soon, so it is seems reasonable to
separate them from regular testcases.
Differential revision: https://reviews.llvm.org/D25010
llvm-svn: 282595
The previous data layout caused issues when dealing with atomics.
Foe example, it is illegal to load a 16-bit value with less than 16-bits
of alignment.
This changes the data layout so that all types are aligned by at least
their own width.
Interestingly, this also _slightly_ decreased register pressure in some
cases.
llvm-svn: 282587
This patch extends __sanitizer_finish_switch_fiber method to optionally return previous stack base and size.
This solves the problem of coroutines/fibers library not knowing the original stack context from which the library is used. It's incorrect to assume that such context is always the default stack of current thread (e.g. one such library may be used from a fiber/coroutine created by another library). Bulding a separate stack tracking mechanism would not only duplicate AsanThread, but also require each coroutines/fibers library to integrate with it.
Author: Andrii Grynenko (andriigrynenko)
Reviewed in: https://reviews.llvm.org/D24628
llvm-svn: 282582
The X86 clang/test/CodeGen/*builtins.c tests define the mm_malloc.h include
guard as a hack for avoiding its inclusion (mm_malloc.h requires a hosted
environment since it expects stdlib.h to be available - which is not the case
in these internal clang codegen tests).
This patch removes this hack and instead passes -ffreestanding to clang cc1.
Differential Revision: https://reviews.llvm.org/D24825
llvm-svn: 282581
The KORTEST was introduced due to a bug where a TEST instruction used a K register.
but, turns out that the opposite case of KORTEST using a GPR is now happening
The change removes the KORTEST flow and adds a COPY instruction from the K reg to a GPR.
Differential Revision: https://reviews.llvm.org/D24953
llvm-svn: 282580
Summary:
Now two replacements are considered order-independent if applying them in
either order produces the same result. These include (but not restricted
to) replacements that:
- don't overlap (being directly adjacent is fine) and
- are overlapping deletions.
- are insertions at the same offset and applying them in either order
has the same effect, i.e. X + Y = Y + X if one inserts text X and the
other inserts text Y.
Discussion about this design can be found in D24717
Reviewers: djasper, klimek
Subscribers: omtcyfz, cfe-commits
Differential Revision: https://reviews.llvm.org/D24800
llvm-svn: 282577
The EHABI unwinder is thread-agnostic, SJLJ unwinder and the DWARF unwinder have
a couple of pthread dependencies.
This patch makes it possible to build the whole of libunwind for a
single-threaded environment.
Reviewers: compnerd
Differential revision: https://reviews.llvm.org/D24984
llvm-svn: 282575
Example:
switch (x) {
int a; // <- This is unreachable but needed
case 1:
a = ...
Differential Revision: https://reviews.llvm.org/D24905
llvm-svn: 282574
This commit enables more unrolling for SystemZ by implementing the
SystemZTargetTransformInfo::getUnrollingPreferences() method.
It has been found that it is better to only unroll moderately, so the
DefaultUnrollRuntimeCount has been moved into UnrollingPreferences in order
to set this to a lower value for SystemZ (4).
Reviewers: Evgeny Stupachenko, Ulrich Weigand.
https://reviews.llvm.org/D24451
llvm-svn: 282570
This check currently doesn't seem to do anything useful on any in-tree target:
On non-x86, it always evaluates to false, so we never hit the code path that
creates the shuffle with zero.
On x86, it just forwards to isShuffleMaskLegal(), which is a reasonable thing to
query in general, but doesn't make sense if only restricted to zero blends.
Differential Revision: https://reviews.llvm.org/D24625
llvm-svn: 282567
A testbot found a regression introduced in the testsuite with
the changes in r282565 on Ubuntu (TestStepNoDebug.ReturnValueTestCase).
I'll get this set up on an ubuntu box and figure out what is happening
there -- likely a problem with the eh_frame augmentation, which isn't
used on macosx.
llvm-svn: 282566
x86AssemblyInspectionEngine and the current UnwindAssembly_x86 to
allow for the core engine to be exercised by unit tests.
The UnwindAssembly_x86 class will have access to Targets, Processes,
Threads, RegisterContexts -- it will be working in the full lldb
environment.
x86AssemblyInspectionEngine is layered away from all of that, it is
given some register definitions and a bag of bytes to profile.
I wrote an initial unittest for a do-nothing simple x86_64/i386
function to start with. I'll be adding more.
The x86 assembly unwinder was added to lldb early in its bringup;
I made some modernization changes as I was refactoring the code
to make it more consistent with how we write lldb today.
I also added RegisterContextMinidump_x86_64.cpp to the xcode project
file so I can run the unittests from that.
The testsuite passes with this change, but there was quite a bit of
code change by the refactoring and it's possible there are some
issues. I'll be testing this more in the coming days, but it looks
like it is behaving correctly as far as I can tell with automated
testing.
<rdar://problem/28509178>
llvm-svn: 282565
Ever since LAA was split out into an analysis on its own, this function
stopped emitting the report directly. Instead it stores it to be
retrieved by the client which can then emit it as its own report
(e.g. -Rpass-analysis=loop-vectorize).
llvm-svn: 282561
This matches the behavior of Binutils linkers. We also change the
default MaxPageSize on x86-64 to 0x1000 to preserver the current
behavior, which is the same as the behavior implemented by gold.
https://llvm.org/bugs/show_bug.cgi?id=30541
Differential Revision: https://reviews.llvm.org/D24987
llvm-svn: 282560
assignment and compound-assignment operators before the left-hand side. (Even
if it's an overloaded operator.)
This completes the implementation of P0145R3 + P0400R0 for all targets except
Windows, where the evaluation order guarantees for <<, >>, and ->* are
unimplementable as the ABI requires the function arguments are evaluated from
right to left (because parameter destructors are run from left to right in the
callee).
llvm-svn: 282556
This patch fixes a regression introduced in r262697 that changed the way the
coverage regions for switches are constructed. The PGO instrumentation counter
for a switch statement refers to the counter at the exit of the switch.
Therefore, the coverage region for the switch statement should cover the code
that comes after the switch, and not the switch statement itself.
rdar://28480997
Differential Revision: https://reviews.llvm.org/D24981
llvm-svn: 282554
other load commands that use the MachO::dylinker_command type
but not used in llvm libObject code but used in llvm tool code.
This includes LC_ID_DYLINKER, LC_LOAD_DYLINKER
and LC_DYLD_ENVIRONMENT load commands.
llvm-svn: 282553