Commit Graph

111938 Commits

Author SHA1 Message Date
David Blaikie bd0c88078a Remove some unneeded #includes to fix layering
llvm-svn: 328838
2018-03-29 22:31:36 +00:00
Craig Topper ee3c19fd7f [X86] Add ReadAfterLds to some 3 src instructions
Sometimes the operand comes after the memory operand so we need 5 ReadDefaults first.

I suspect we also need to do something for the mask operand for masked avx512 instructions? I'm not sure if the mask should be ReadAfterLd or not since it can mask faults. If it shouldn't be ReadAfterLd then we're probably wrong for zero masking instructions already.

Differential Revision: https://reviews.llvm.org/D44726

llvm-svn: 328834
2018-03-29 22:03:05 +00:00
Matt Arsenault efd1b30436 AMDGPU: Fix build warning in release
llvm-svn: 328832
2018-03-29 21:44:44 +00:00
Matt Arsenault 03ae399d50 AMDGPU: Support realigning stack
While the stack access instructions don't care about
alignment > 4, some transformations on the pointer calculation
do make assumptions based on knowing the low bits of a pointer
are 0. If a stack object ends up being accessed through its
absolute address (relative to the kernel scratch wave offset),
the addressing expression may depend on the stack frame being
properly aligned. This was breaking in a testcase due to the
add->or combine.

I think some of the SP/FP handling logic is still backwards,
and overly simplistic to support all of the stack features.
Code which tries to modify the SP with inline asm for example
or variable sized objects will probably require redoing this.

llvm-svn: 328831
2018-03-29 21:30:06 +00:00
Evgeniy Stepanov 50635dab26 Add msan custom mapping options.
Similarly to https://reviews.llvm.org/D18865 this adds options to provide custom mapping for msan.
As discussed in http://lists.llvm.org/pipermail/llvm-dev/2018-February/121339.html

Patch by vit9696(at)avp.su.

Differential Revision: https://reviews.llvm.org/D44926

llvm-svn: 328830
2018-03-29 21:18:17 +00:00
Craig Topper 3f2dbec652 [X86] Remove ReadAfterLd from BMI and TBM instructions that don't have a register operand in their memory form
The memory form of these instructions only read an input from memory. They don't have any register operands.

Differential Revision: https://reviews.llvm.org/D44836

llvm-svn: 328828
2018-03-29 21:03:53 +00:00
Craig Topper 89310f56c8 [X86] Correct the placement of ReadAfterLd in BEXTR and BZHI. Add dedicated SchedRW for BEXTR/BZHI.
These instructions have the memory operand before the register operand. So we need to put ReadDefault for all the load ops first. Then the ReadAfterLd

Differential Revision: https://reviews.llvm.org/D44838

llvm-svn: 328823
2018-03-29 20:41:39 +00:00
Philip Reames 5c14ed89f6 [NFC][LICM] Rearrange checks to have the cheap bail out first
llvm-svn: 328822
2018-03-29 20:32:15 +00:00
Matt Arsenault ffb132e74b AMDGPU: Increase default stack alignment
8 and 16-byte values are common, so increase the default
alignment to avoid realigning the stack in most functions.

llvm-svn: 328821
2018-03-29 20:22:04 +00:00
Matt Arsenault 6c041a3cab AMDGPU: Fix selection error on constant loads with < 4 byte alignment
llvm-svn: 328818
2018-03-29 19:59:28 +00:00
Philip Reames e4b728e82b Fix an accidental circular dependence
llvm-svn: 328816
2018-03-29 19:22:12 +00:00
Mandeep Singh Grang 10d8b85570 [Mips] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.

Reviewers: sdardis, RKSimon, dsanders, atanasyan

Reviewed By: atanasyan

Subscribers: atanasyan, arichardson, llvm-commits

Differential Revision: https://reviews.llvm.org/D44869

llvm-svn: 328815
2018-03-29 19:05:26 +00:00
Zachary Turner 3203e27473 [MSF] Default to FPM2, and always mark FPM pages allocated.
There are two FPMs in an MSF file, the idea being that for
incremental updates you can write to the alternate one and then
atomically swap them on commit.  LLVM defaulted to using FPM1
on the first commit, but this differs from Microsoft's behavior
which is to default to using FPM2 on the first commit.  To
eliminate some byte-level file differences, this patch changes
LLVM's default to also be FPM2.

Additionally, LLVM was trying to be "smart" about marking FPM
pages allocated.  In addition to marking every page belonging
to the alternate FPM as unallocated, LLVM also marked pages at
the end of the main FPM which were not needed as unallocated.

In order to match the behavior of Microsoft-generated PDBs, we
now always mark every FPM block as allocated, regardless of
whether it is in the main FPM or the alt FPM, and regardless of
whether or not it describes blocks which are actually in the file.

This has the side benefit of simplifying our code.

llvm-svn: 328812
2018-03-29 18:34:15 +00:00
Craig Topper 2fa1436206 [IR][CodeGen] Remove dependency on EVT from IR/Function.cpp. Move EVT to CodeGen layer.
Currently EVT is in the IR layer only because of Function.cpp needing a very small piece of the functionality of EVT::getEVTString(). The rest of EVT is used in codegen making CodeGen a better place for it.

The previous code converted a Type* to EVT and then called getEVTString. This was only expected to handle the primitive types from Type*. Since there only a few primitive types, we can just print them as strings directly.

Differential Revision: https://reviews.llvm.org/D45017

llvm-svn: 328806
2018-03-29 17:21:10 +00:00
Paul Robinson b271f31d8d Reapply "[DWARFv5] Emit file 0 to the line table."
DWARF v5 specifies that the root file (also given in the DW_AT_name
attribute of the compilation unit DIE) should be emitted explicitly to
the line table's list of files.  This makes the line table more
independent of the .debug_info section.
We emit the new syntax only for DWARF v5 and later.

Fixes the bug found by asan. Also XFAIL the new test for Darwin, which
is stuck on DWARF v2, and fix up other tests so they stop failing on
Windows.  Last but not least, don't break "clang -g" of an assembler
file that has .file directives in it.

Differential Revision: https://reviews.llvm.org/D44054

llvm-svn: 328805
2018-03-29 17:16:41 +00:00
Haicheng Wu c7cc87922e [JumpThreading] Don't select an edge that we know we can't thread
In r312664 (D36404), JumpThreading stopped threading edges into
loop headers. Unfortunately, I observed a significant performance
regression as a result of this change. Upon further investigation,
the problematic pattern looked something like this (after
many high level optimizations):

while (true) {
    bool cond = ...;
    if (!cond) {
        <body>
    }
    if (cond)
        break;
}

Now, naturally we want jump threading to essentially eliminate the
second if check and hook up the edges appropriately. However, the
above mentioned change, prevented it from doing this because it would
have to thread an edge into the loop header.

Upon further investigation, what is happening is that since both branches
are threadable, JumpThreading picks one of them at arbitrarily. In my
case, because of the way that the IR ended up, it tended to pick
the one to the loop header, bailing out immediately after. However,
if it had picked the one to the exit block, everything would have
worked out fine (because the only remaining branch would then be folded,
not thraded which is acceptable).

Thus, to fix this problem, we can simply eliminate loop headers from
consideration as possible threading targets earlier, to make sure that
if there are multiple eligible branches, we can still thread one of
the ones that don't target a loop header.

Patch by Keno Fischer!
Differential Revision: https://reviews.llvm.org/D42260

llvm-svn: 328798
2018-03-29 16:01:26 +00:00
Pavel Labath ea0f841c3b .debug_names: Correctly align the AugmentationStringSize field
We should align the value of the field, not the overall section offset.

This distinction matters if one of the debug_names contributions is not
of size which is a multiple of four. The dwarf producers may choose to
emit rounded contributions, but they are not required to do so. In the
latter case, without this patch we would corrupt the parsing state, as
we would adjust the offset even if subsequent contributions contained
correctly rounded augmentation strings.

llvm-svn: 328796
2018-03-29 15:12:45 +00:00
Krzysztof Parzyszek dc7a557e6a [Hexagon] Add support to handle bit-reverse load intrinsics
Patch by Sumanth Gundapaneni.

llvm-svn: 328774
2018-03-29 13:52:46 +00:00
Pavel Labath 2d1fc4375f .debug_names: Parse DW_IDX_die_offset as a reference
Before this patch we were parsing the attributes as section offsets, as
that is what apple_names is doing. However, this is not correct as DWARF
v5 specifies that this attribute should use the Reference form class.

This also updates all the testcases (except the ones that deliberately
pass a different form) to use the correct form class.

llvm-svn: 328773
2018-03-29 13:47:57 +00:00
Simon Pilgrim 71c5f3fffd [X86][SSE] Don't bother re-adding combined target shuffles to the work list
We are re-adding all the bitcasts, constant masks and target shuffles to the work list for no apparent gain.

Found while investigating adding SimplifyDemandedVectorElts to target shuffles.

Differential Revision: https://reviews.llvm.org/D44942

llvm-svn: 328771
2018-03-29 11:18:41 +00:00
Simon Dardis 32a27fc77a [Mips] Remove dead code
I believe the role of ehDataReg has been replaced by MipsABIInfo::GetEhDataReg, thus removing the dead code.

Patch By: Wei-Ren Chen.

Reviewers: ehostunreach, sdardis

Differential Revision: https://reviews.llvm.org/D44867

llvm-svn: 328767
2018-03-29 09:21:20 +00:00
David Green b0aa36f9c2 [LoopRotate] Restructuring LoopRotation.cpp to create Loop Rotation Pass with Loop Rotation Utility Interface
The existing LoopRotation.cpp is implemented as one of loop passes instead of
being a utility. The user cannot easily perform the loop rotation selectively
(or on demand) under different optimization level. For example, the loop
rotation is needed as part of the logic to convert a loop into a loop with
bottom test for a transformation. If the loop rotation is simply added as a
loop pass before the transformation, the pass is skipped if it is compiled at
–O0 or if it is explicitly disabled by the user, causing the compiler to
generate incorrect code. Furthermore, as a loop pass it will rotate all loops
instead of just the relevant loops.

We provide a utility interface for the loop rotation so that the loop rotation
can be called on demand. The changeset is as follows:

- Create a new file lib/Transforms/Utils/LoopRotationUtils.cpp and move the main
  implementation of class LoopRotate into this file.
- Create a new file llvm/include/Transform/Utils/LoopRotationUtils.h with the
  interface LoopRotation(...).
- Original LoopRotation.cpp is changed to use the utility function LoopRotation
  in LoopRotationUtils.cpp. This is done in the same way community did for
  mem-to-reg implementation.

Patch by Jin Lin!

Differential Revision: https://reviews.llvm.org/D44595

llvm-svn: 328766
2018-03-29 08:48:15 +00:00
Benjamin Kramer 6b995a4a7e [Transforms] Make sure to include the c binding header when defining c binding functions
Otherwise the definitions can't see the extern C declarations and get
name mangled, making it impossible for users to call them. This breaks
the Go bindings.

llvm-svn: 328765
2018-03-29 07:56:53 +00:00
Max Kazantsev 18f93894db [NFC] Fix meaningless assert in SCEV
llvm-svn: 328764
2018-03-29 07:54:59 +00:00
Craig Topper a21758fa2c [X86] Don't pass getRegisterName from the InstPrinters into EmitAnyX86InstComments. Just always use the function from the ATTPrinter. NFC
The IntelPrinter and the ATTPrinter produce the same strings for the same input. We already use the ATTPrinter explicitly in several other places.

llvm-svn: 328762
2018-03-29 04:14:04 +00:00
Robert Widmann 6775f52fe0 [LLVM-C] Finish exception instruction bindings
Summary:
Add support for cleanupret, catchret, catchpad, cleanuppad and catchswitch and their associated accessors.

Test is modified from SimplifyCFG because it contains many diverse usages of these instructions.

Reviewers: whitequark, deadalnix, echristo

Reviewed By: echristo

Subscribers: llvm-commits, harlanhaskins

Differential Revision: https://reviews.llvm.org/D44496

llvm-svn: 328759
2018-03-29 03:43:15 +00:00
Craig Topper 7456af88f4 [X86] Rename RIi64_NOREX tblgen class to just Ii64. Make RIi64 inherit from it. NFC
This feels more consistent with the other classes. We don't need to say _NOREX if we didn't start it with an R in the first place.

llvm-svn: 328757
2018-03-29 03:14:57 +00:00
Craig Topper 7441ffff84 [X86] Cleanup inheritance of the X86InstrFormats.td classes. NFC
EVEX shouldn't inherit from VEX and EVEX_4V shouldn't inherit from VEX_4V.

llvm-svn: 328756
2018-03-29 03:14:56 +00:00
George Burgess IV af0b06f4fd [MemorySSA] Turn an assert into a condition
Eli pointed out that variadic functions are totally a thing, so this
assert is incorrect.

No test-case is provided, since the only way this assert fires is if a
specific DenseMap falls back to doing `isEqual` checks, and that seems
fairly brittle (and requires a pyramid of growing
`call void (i8, ...) @varargs(i8 0)`).

llvm-svn: 328755
2018-03-29 03:12:03 +00:00
George Burgess IV 3588fd4865 [MemorySSA] Consider callsite args for hashing and equality.
We use a `DenseMap<MemoryLocOrCall, MemlocStackInfo>` to keep track of
prior work when optimizing uses in MemorySSA. Because we weren't
accounting for callsite arguments in either the hash code or equality
tests for `MemoryLocOrCall`s, we optimized uses too aggressively in
some rare cases.

Fix by Daniel Berlin.

Should fix PR36883.

llvm-svn: 328748
2018-03-29 00:54:39 +00:00
David Blaikie b3f471a4bd Remove some unused includes to fix layering.
llvm-svn: 328745
2018-03-29 00:29:45 +00:00
David Blaikie 8ad9a97310 Plumb useAA through TargetTransformInfo to remove Transforms->CodeGen header dependency
Thanks to echristo for the pointers on direction.

llvm-svn: 328737
2018-03-28 22:28:50 +00:00
Craig Topper aac23d7881 [X86][SkylakeServer] Remove checks for 'k', 'z', '_Int' and 'b' from scheduler regexs.
Most of these were optional matches at the end of the strings, but since the strings themselves are prefix matches by default you don't need to check for something optional at the end.

I've left the 'b' on memory instructions where it means 'broadcast' because I'm not sure those really have the same load latency and we may need to split them explicitly in the future.

llvm-svn: 328730
2018-03-28 20:40:24 +00:00
Jun Bum Lim f90fe701ef [PostRAMachineSink] preserve CFG
Summary: Mark CFG is preserved  since this pass do not make any change in CFG.

Reviewers: sebpop, mzolotukhin, mcrosier

Reviewed By: mzolotukhin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44845

llvm-svn: 328727
2018-03-28 19:56:26 +00:00
Krzysztof Parzyszek 440ba3ae5c [Hexagon] Add support for "new" circular buffer intrinsics
These instructions have been around for a long time, but we
haven't supported intrinsics for them. The "new" versions use
the CSx register for the start of the buffer instead of the K
field in the Mx register.

We need to use pseudo instructions for these instructions until
after register allocation. The problem is that these instructions
allocate a M0/CS0 or M1/CS1 pair. But, we can't generate code for
the CSx set-up until after register allocation when the Mx
register has been fixed for the instruction.

There is a related clang patch.

Patch by Brendon Cahoon.

llvm-svn: 328724
2018-03-28 19:38:29 +00:00
David Blaikie eb8cc04ea2 Oops - moved slightly too many things from Scalar to Utils. Move LoopSimplifyCFG things back
llvm-svn: 328720
2018-03-28 18:03:25 +00:00
Jessica Paquette 4aa14dbcc2 [MachineOutliner] Simplify call outlining + require valid callee save info for call outlining
This commit simplifies the call outlining logic by removing references to the
Function associated with the callee. To do this, it requires that valid
callee save info is available to the outliner.

llvm-svn: 328719
2018-03-28 17:52:31 +00:00
David Blaikie a373d18eb7 Transforms: Introduce Transforms/Utils.h rather than spreading the declarations amongst Scalar.h and IPO.h
Fixes layering - Transforms/Utils shouldn't depend on including a Scalar
or IPO header, because Scalar and IPO depend on Utils.

llvm-svn: 328717
2018-03-28 17:44:36 +00:00
Peter Collingbourne d579c31d68 [llvm-ar] Support multiple dashed options
This allows syntax like:
$ llvm-ar -c -r -u file.a file.o

This is in addition to the other formats that are already supported:
$ llvm-ar cru file.a file.o
$ llvm-ar -cru file.a file.o

Patch by Tom Anderson!

Differential Revision: https://reviews.llvm.org/D44452

llvm-svn: 328716
2018-03-28 17:21:14 +00:00
Dmitry Preobrazhensky 622bde8bc7 [AMDGPU][MC] Added ds_add_src2_f32
See bug 36833: https://bugs.llvm.org/show_bug.cgi?id=36833

Differential Revision: https://reviews.llvm.org/D44779

Reviewers: arsenm, artem.tamazov, timcorringham
llvm-svn: 328713
2018-03-28 16:21:56 +00:00
Dmitry Preobrazhensky 2456ac696a [AMDGPU][MC] Added PCK variants of image load/store instructions
See bug 36834: https://bugs.llvm.org/show_bug.cgi?id=36834

Differential Revision: https://reviews.llvm.org/D44795

Reviewers: artem.tamazov, arsenm, timcorringham, nhaehnle
llvm-svn: 328710
2018-03-28 15:44:16 +00:00
Dmitry Preobrazhensky a917e88585 [AMDGPU][MC][GFX9] Added buffer_*_format_d16_hi_x
See bug 36835: https://bugs.llvm.org/show_bug.cgi?id=36835

Differential Revision: https://reviews.llvm.org/D44825

Reviewers: artem.tamazov, arsenm, timcorringham
llvm-svn: 328707
2018-03-28 14:53:13 +00:00
Dmitry Preobrazhensky dd2b929ffb [AMDGPU][MC][GFX9] Added s_scratch* instructions
See bug 36836: https://bugs.llvm.org/show_bug.cgi?id=36836

Differential Revision: https://reviews.llvm.org/D44832

Reviewers: artem.tamazov, arsenm, timcorringham
llvm-svn: 328704
2018-03-28 14:08:03 +00:00
Simon Pilgrim b1bc6cd96b [X86][Btver2] Moved JWriteFCmp/JWriteFCmpY classes next to each other. NFCI
Renamed JWriteFPAY22 to JWriteFCmpY - we've tended to avoid latency based names

llvm-svn: 328701
2018-03-28 13:53:21 +00:00
Alexander Potapenko 202f809437 Revert "Reapply "[DWARFv5] Emit file 0 to the line table.""
This reverts commit r328676.

Commit r328676 broke the -no-integrated-as flag necessary to build Linux kernel with Clang:

$ cat t.c
void foo() {}
$ clang -no-integrated-as   -c  t.c -g
/tmp/t-dcdec5.s: Assembler messages:
/tmp/t-dcdec5.s:8: Error: file number less than one
clang-7.0: error: assembler command failed with exit code 1 (use -v to see invocation)

llvm-svn: 328699
2018-03-28 12:36:46 +00:00
Andrea Di Biagio 5076b98fb9 [X86][BtVer2] Fix the number of micro opcodes for AES[ENC|DEC] and other YMM instructions.
Similar to r328694. The number of micro opcodes should be 2 for those
instructions.

This was found when testing AVX code for BtVer2 using llvm-mca.

llvm-svn: 328698
2018-03-28 12:12:04 +00:00
Alexander Potapenko 4e7ad0805e [MSan] Introduce ActualFnStart. NFC
This is a step towards the upcoming KMSAN implementation patch.
KMSAN is going to prepend a special basic block containing
tool-specific calls to each function. Because we still want to
instrument the original entry block, we'll need to store it in
ActualFnStart.

For MSan this will still be F.getEntryBlock(), whereas for KMSAN
it'll contain the second BB.

llvm-svn: 328697
2018-03-28 11:35:09 +00:00
Tim Renouf cdac172e2a Revert "[AMDGPU] For OS type AMDPAL, fixed scratch on compute shader"
This reverts commit 0daf86291d3aa04d3cc280cd0ef24abdb0174981.

It was causing an assert in test/CodeGen/AMDGPU/amdpal.ll only on a
release-with-asserts build. I will resubmit the change when I have fixed
that.

Change-Id: If270594eba27a7dc4076bdeab3fa8e6bfda3288a
llvm-svn: 328695
2018-03-28 11:21:07 +00:00
Andrea Di Biagio 010924e35c [X86][BtVer2] Fix the number of micro opcodes for a bunch of YMM instructions.
The Jaguar backend natively supports 128-bit data types. Operations on YMM
registers are split into two COPs (complex operations). Each COP consumes a slot
in the dispatch group, and in the reorder buffer.

The scheduling model for Jaguar should mark those instructions as `let
NumMicroOps = 2`.

This was found when testing AVX code for BtVer2 using llvm-mca.

llvm-svn: 328694
2018-03-28 10:49:33 +00:00
Alexander Potapenko e1d5877847 [MSan] Add an isStore argument to getShadowOriginPtr(). NFC
This is a step towards the upcoming KMSAN implementation patch.
The isStore argument is to be used by getShadowOriginPtrKernel(),
it is ignored by getShadowOriginPtrUserspace().

Depending on whether a memory access is a load or a store, KMSAN
instruments it with different functions, __msan_metadata_ptr_for_load_X()
and __msan_metadata_ptr_for_store_X().

Those functions may return different values for a single address,
which is necessary in the case the runtime library decides to ignore
particular accesses.

llvm-svn: 328692
2018-03-28 10:17:17 +00:00