Commit Graph

23886 Commits

Author SHA1 Message Date
Sander de Smalen dc00becd1b [DebugInfo] Emit DWARF reference for DIVariable 'count' in DISubrange
Summary:
This patch implements the codegen of DWARF debug info for non-constant
'count' fields for DISubrange.

This is patch [2/3] in a series to extend LLVM's DISubrange Metadata
node to support debugging of C99 variable length arrays and vectors with
runtime length like the Scalable Vector Extension for AArch64. It is
also a first step towards representing more complex cases like arrays
in Fortran.

Reviewers: echristo, pcc, aprantl, dexonsmith, clayborg, kristof.beyls, dblaikie

Reviewed By: aprantl

Subscribers: fhahn, aemerson, rengolin, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D41696

llvm-svn: 323323
2018-01-24 13:35:54 +00:00
Sander de Smalen fdf40917d9 [Metadata] Extend 'count' field of DISubrange to take a metadata node
Summary:
This patch extends the DISubrange 'count' field to take either a
(signed) constant integer value or a reference to a DILocalVariable
or DIGlobalVariable.

This is patch [1/3] in a series to extend LLVM's DISubrange Metadata
node to support debugging of C99 variable length arrays and vectors with
runtime length like the Scalable Vector Extension for AArch64. It is
also a first step towards representing more complex cases like arrays
in Fortran.

Reviewers: echristo, pcc, aprantl, dexonsmith, clayborg, kristof.beyls, dblaikie

Reviewed By: aprantl

Subscribers: rnk, probinson, fhahn, aemerson, rengolin, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D41695

llvm-svn: 323313
2018-01-24 09:56:07 +00:00
Sven van Haastregt e8404780c3 [DAGCombiner] Bail out if vector size is not a multiple
For the included test case, the DAG transformation
  concat_vectors(scalar, undef) -> scalar_to_vector(sclr)
would attempt to create a v2i32 vector for a v9i8
concat_vector.  Bail out to avoid creating a bitcast with
mismatching sizes later on.

Differential Revision: https://reviews.llvm.org/D42379

llvm-svn: 323312
2018-01-24 09:53:47 +00:00
Martin Storsjo e8248f2e10 [GlobalMerge] Don't merge dllexport globals
Merging such globals loses the dllexport attribute. Add a test
to check that normal globals still are merged.

Differential Revision: https://reviews.llvm.org/D42127

llvm-svn: 323307
2018-01-24 06:40:04 +00:00
Aditya Nandakumar f2aa2af24e [GISel]: Remove redundant copies at the end of ISel
https://reviews.llvm.org/D42402

A lot of these copies are useless (copies b/w VRegs having the same
regclass) and should be cleaned up.

llvm-svn: 323291
2018-01-24 01:35:26 +00:00
Evgeniy Stepanov d5a6fdbe95 [safestack] Inline safestack pointer access when possible.
Summary:
This adds an -mllvm flag that forces the use of a runtime function call to
get the unsafe stack pointer, the same that is currently used on non-x86, non-aarch64 android.
The call may be inlined.

Reviewers: pcc

Subscribers: aemerson, kristof.beyls, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D37405

llvm-svn: 323259
2018-01-23 21:27:07 +00:00
Yaxun Liu 8b7454a8dd CodeGen: Fix assertion in ScheduleDAGMILive::scheduleMI due to llvm.dbg.value
Fix a bug in ScheduleDAGMILive::scheduleMI which causes BotRPTracker not tracking CurrentBottom in some rare cases involving llvm.dbg.value.

This issues causes amdgcn target to assert when compiling some user codes with -g.

Differential Revision: https://reviews.llvm.org/D42394

llvm-svn: 323214
2018-01-23 16:04:53 +00:00
Serguei Katkov 17e5794f11 [CGP] Fix the GV handling in complex addressing mode
If in complex addressing mode the difference is in GV then
base reg should not be installed because we plan to use
base reg as a merge point of different GVs.

This is a fix for PR35980.

Reviewers: reames, john.brawn, santosh
Reviewed By: john.brawn
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42230

llvm-svn: 323192
2018-01-23 12:07:49 +00:00
Chandler Carruth c58f2166ab Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre..
Summary:
First, we need to explain the core of the vulnerability. Note that this
is a very incomplete description, please see the Project Zero blog post
for details:
https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html

The basis for branch target injection is to direct speculative execution
of the processor to some "gadget" of executable code by poisoning the
prediction of indirect branches with the address of that gadget. The
gadget in turn contains an operation that provides a side channel for
reading data. Most commonly, this will look like a load of secret data
followed by a branch on the loaded value and then a load of some
predictable cache line. The attacker then uses timing of the processors
cache to determine which direction the branch took *in the speculative
execution*, and in turn what one bit of the loaded value was. Due to the
nature of these timing side channels and the branch predictor on Intel
processors, this allows an attacker to leak data only accessible to
a privileged domain (like the kernel) back into an unprivileged domain.

The goal is simple: avoid generating code which contains an indirect
branch that could have its prediction poisoned by an attacker. In many
cases, the compiler can simply use directed conditional branches and
a small search tree. LLVM already has support for lowering switches in
this way and the first step of this patch is to disable jump-table
lowering of switches and introduce a pass to rewrite explicit indirectbr
sequences into a switch over integers.

However, there is no fully general alternative to indirect calls. We
introduce a new construct we call a "retpoline" to implement indirect
calls in a non-speculatable way. It can be thought of loosely as
a trampoline for indirect calls which uses the RET instruction on x86.
Further, we arrange for a specific call->ret sequence which ensures the
processor predicts the return to go to a controlled, known location. The
retpoline then "smashes" the return address pushed onto the stack by the
call with the desired target of the original indirect call. The result
is a predicted return to the next instruction after a call (which can be
used to trap speculative execution within an infinite loop) and an
actual indirect branch to an arbitrary address.

On 64-bit x86 ABIs, this is especially easily done in the compiler by
using a guaranteed scratch register to pass the target into this device.
For 32-bit ABIs there isn't a guaranteed scratch register and so several
different retpoline variants are introduced to use a scratch register if
one is available in the calling convention and to otherwise use direct
stack push/pop sequences to pass the target address.

This "retpoline" mitigation is fully described in the following blog
post: https://support.google.com/faqs/answer/7625886

We also support a target feature that disables emission of the retpoline
thunk by the compiler to allow for custom thunks if users want them.
These are particularly useful in environments like kernels that
routinely do hot-patching on boot and want to hot-patch their thunk to
different code sequences. They can write this custom thunk and use
`-mretpoline-external-thunk` *in addition* to `-mretpoline`. In this
case, on x86-64 thu thunk names must be:
```
  __llvm_external_retpoline_r11
```
or on 32-bit:
```
  __llvm_external_retpoline_eax
  __llvm_external_retpoline_ecx
  __llvm_external_retpoline_edx
  __llvm_external_retpoline_push
```
And the target of the retpoline is passed in the named register, or in
the case of the `push` suffix on the top of the stack via a `pushl`
instruction.

There is one other important source of indirect branches in x86 ELF
binaries: the PLT. These patches also include support for LLD to
generate PLT entries that perform a retpoline-style indirection.

The only other indirect branches remaining that we are aware of are from
precompiled runtimes (such as crt0.o and similar). The ones we have
found are not really attackable, and so we have not focused on them
here, but eventually these runtimes should also be replicated for
retpoline-ed configurations for completeness.

For kernels or other freestanding or fully static executables, the
compiler switch `-mretpoline` is sufficient to fully mitigate this
particular attack. For dynamic executables, you must compile *all*
libraries with `-mretpoline` and additionally link the dynamic
executable and all shared libraries with LLD and pass `-z retpolineplt`
(or use similar functionality from some other linker). We strongly
recommend also using `-z now` as non-lazy binding allows the
retpoline-mitigated PLT to be substantially smaller.

When manually apply similar transformations to `-mretpoline` to the
Linux kernel we observed very small performance hits to applications
running typical workloads, and relatively minor hits (approximately 2%)
even for extremely syscall-heavy applications. This is largely due to
the small number of indirect branches that occur in performance
sensitive paths of the kernel.

When using these patches on statically linked applications, especially
C++ applications, you should expect to see a much more dramatic
performance hit. For microbenchmarks that are switch, indirect-, or
virtual-call heavy we have seen overheads ranging from 10% to 50%.

However, real-world workloads exhibit substantially lower performance
impact. Notably, techniques such as PGO and ThinLTO dramatically reduce
the impact of hot indirect calls (by speculatively promoting them to
direct calls) and allow optimized search trees to be used to lower
switches. If you need to deploy these techniques in C++ applications, we
*strongly* recommend that you ensure all hot call targets are statically
linked (avoiding PLT indirection) and use both PGO and ThinLTO. Well
tuned servers using all of these techniques saw 5% - 10% overhead from
the use of retpoline.

We will add detailed documentation covering these components in
subsequent patches, but wanted to make the core functionality available
as soon as possible. Happy for more code review, but we'd really like to
get these patches landed and backported ASAP for obvious reasons. We're
planning to backport this to both 6.0 and 5.0 release streams and get
a 5.0 release with just this cherry picked ASAP for distros and vendors.

This patch is the work of a number of people over the past month: Eric, Reid,
Rui, and myself. I'm mailing it out as a single commit due to the time
sensitive nature of landing this and the need to backport it. Huge thanks to
everyone who helped out here, and everyone at Intel who helped out in
discussions about how to craft this. Also, credit goes to Paul Turner (at
Google, but not an LLVM contributor) for much of the underlying retpoline
design.

Reviewers: echristo, rnk, ruiu, craig.topper, DavidKreitzer

Subscribers: sanjoy, emaste, mcrosier, mgorny, mehdi_amini, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D41723

llvm-svn: 323155
2018-01-22 22:05:25 +00:00
Marina Yatsina e4d63a499d Fixing warnings caused by commit 323095
Change-Id: I4e1f81db2f5382a820f4016c23b243e4d5aebf51
llvm-svn: 323114
2018-01-22 13:24:10 +00:00
Marina Yatsina 0bf841ac2a Separate LoopTraversal, ReachingDefAnalysis and BreakFalseDeps into their own files.
This is the one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
Most of the patches are intended at refactoring the existent code.

Additional relevant reviews:
https://reviews.llvm.org/D40330
https://reviews.llvm.org/D40331
https://reviews.llvm.org/D40332
https://reviews.llvm.org/D40334

Differential Revision: https://reviews.llvm.org/D40333

Change-Id: Ie5f8eb34d98cfdfae23a3072eb69b5794f0e2d56
llvm-svn: 323095
2018-01-22 10:06:50 +00:00
Marina Yatsina 3d8efa4f0c Rename ExecutionDepsFix files to ExecutionDomainFix
This is the one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
Most of the patches are intended at refactoring the existent code.

Additional relevant reviews:
https://reviews.llvm.org/D40330
https://reviews.llvm.org/D40331
https://reviews.llvm.org/D40333
https://reviews.llvm.org/D40334

Differential Revision: https://reviews.llvm.org/D40332

Change-Id: I6a048cca7fdafbfc42fb1bac94343e483befded8
llvm-svn: 323094
2018-01-22 10:06:33 +00:00
Marina Yatsina c757c0b6ba ExecutionDepsFix refactoring:
- clang-format

This is the one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
Most of the patches are intended at refactoring the existent code.

Additional relevant reviews:
https://reviews.llvm.org/D40330
https://reviews.llvm.org/D40332
https://reviews.llvm.org/D40333
https://reviews.llvm.org/D40334

Differential Revision: https://reviews.llvm.org/D40331

Change-Id: I131b126af13bc743bc5d69d83699e52b9b720979
llvm-svn: 323093
2018-01-22 10:06:18 +00:00
Marina Yatsina 30234e866a ExecutionDepsFix refactoring:
- Moving comments to class definition in header file
- Changing comments to doxygen style
- Rephrase loop traversal explaining comment

This is the one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
Most of the patches are intended at refactoring the existent code.

Additional relevant reviews:
https://reviews.llvm.org/D40330
https://reviews.llvm.org/D40332
https://reviews.llvm.org/D40333
https://reviews.llvm.org/D40334

Differential Revision: https://reviews.llvm.org/D40331

Change-Id: I9a12618db5b66128611fa71b54a233414f6012ac
llvm-svn: 323092
2018-01-22 10:06:10 +00:00
Marina Yatsina 273d35db4d ExecutionDepsFix refactoring:
- Removing LiveRegs

This is the one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
Most of the patches are intended at refactoring the existent code.

Additional relevant reviews:
https://reviews.llvm.org/D40330
https://reviews.llvm.org/D40332
https://reviews.llvm.org/D40333
https://reviews.llvm.org/D40334

Differential Revision: https://reviews.llvm.org/D40331

Change-Id: I8ab56d99951a6d6981542f68d94c1f624f3c9fbf
llvm-svn: 323091
2018-01-22 10:06:01 +00:00
Marina Yatsina 6daa2d25e8 ExecutionDepsFix refactoring:
- Changing LiveRegs to be a vector

This is the one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
Most of the patches are intended at refactoring the existent code.

Additional relevant reviews:
https://reviews.llvm.org/D40330
https://reviews.llvm.org/D40332
https://reviews.llvm.org/D40333
https://reviews.llvm.org/D40334

Differential Revision: https://reviews.llvm.org/D40331

Change-Id: I9cdd364bd7bf2a0bf61ea41a48d4bd310ec3bce4
llvm-svn: 323090
2018-01-22 10:05:53 +00:00
Marina Yatsina 0f9cda52f5 ExecutionDepsFix refactoring:
- Changing DenseMap<MBB*, LiveReg*> to SmallVector<LiveReg*>
- Now the MBB number will be the index of LiveReg in the vector.
- Adding asserts

This patch is NFC.

This is the one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
Most of the patches are intended at refactoring the existent code.

Additional relevant reviews:
https://reviews.llvm.org/D40330
https://reviews.llvm.org/D40332
https://reviews.llvm.org/D40333
https://reviews.llvm.org/D40334

Differential Revision: https://reviews.llvm.org/D40331

Change-Id: If4a3f141693d0361ddb292432337dbb63a1e69ee
llvm-svn: 323089
2018-01-22 10:05:45 +00:00
Marina Yatsina 6b34e63ebd ExecutionDepsFix refactoring:
- Remove unneeded includes and unneeded members
- Use range iterators
- Variable renaming, typedefs, extracting constants
- Removing {} from one line ifs

This patch is NFC.

This is the one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
Most of the patches are intended at refactoring the existent code.

Additional relevant reviews:
https://reviews.llvm.org/D40330
https://reviews.llvm.org/D40332
https://reviews.llvm.org/D40333
https://reviews.llvm.org/D40334

Differential Revision: https://reviews.llvm.org/D40331

Change-Id: Ib59060ab3fa5bee3bf2ca2045c24e572635ee7f6
llvm-svn: 323088
2018-01-22 10:05:37 +00:00
Marina Yatsina 6fc2aaae8d Separate ExecutionDepsFix into 4 parts:
1. ReachingDefsAnalysis - Allows to identify for each instruction what is the “closest” reaching def of a certain register. Used by BreakFalseDeps (for clearance calculation) and ExecutionDomainFix (for arbitrating conflicting domains).
2. ExecutionDomainFix - Changes the variant of the instructions in order to minimize domain crossings.
3. BreakFalseDeps - Breaks false dependencies.
4. LoopTraversal - Creatws a traversal order of the basic blocks that is optimal for loops (introduced in revision L293571). Both ExecutionDomainFix and ReachingDefsAnalysis use this to determine the order they will traverse the basic blocks.

This also included the following changes to ExcecutionDepsFix original logic:
1. BreakFalseDeps and ReachingDefsAnalysis logic no longer restricted by a register class.
2. ReachingDefsAnalysis tracks liveness of reg units instead of reg indices into a given reg class.

Additional changes in affected files:
1. X86 and ARM targets now inherit from ExecutionDomainFix instead of ExecutionDepsFix. BreakFalseDeps also was added to the passes they activate.
2. Comments and references to ExecutionDepsFix replaced with ExecutionDomainFix and BreakFalseDeps, as appropriate.

Additional refactoring changes will follow.

This commit is (almost) NFC.
The only functional change is that now BreakFalseDeps will break dependency for all register classes.
Since no additional instructions were added to the list of instructions that have false dependencies, there is no actual change yet.
In a future commit several instructions (and tests) will be added.

This is the first of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
Most of the patches are intended at refactoring the existent code.

Additional relevant reviews:
https://reviews.llvm.org/D40331
https://reviews.llvm.org/D40332
https://reviews.llvm.org/D40333
https://reviews.llvm.org/D40334

Differential Revision: https://reviews.llvm.org/D40330

Change-Id: Icaeb75e014eff96a8f721377783f9a3e6c679275
llvm-svn: 323087
2018-01-22 10:05:23 +00:00
Jonas Paulsson 7ad28863fb [SelectionDAG] Fix codegen of vector stores with non byte-sized elements.
This was completely broken, but hopefully fixed by this patch.

In cases where it is needed, a vector with non byte-sized elements is stored
by extracting, zero-extending, shift:ing and or:ing the elements into an
integer of the same width as the vector, which is then stored.

Review: Eli Friedman, Ulrich Weigand
https://reviews.llvm.org/D42100#inline-369520
https://bugs.llvm.org/show_bug.cgi?id=35520

llvm-svn: 323042
2018-01-20 16:05:10 +00:00
Saleem Abdulrasool 99f479abcf CodeGen: handle llvm.used properly for COFF
`llvm.used` contains a list of pointers to named values which the
compiler, assembler, and linker are required to treat as if there is a
reference that they cannot see.  Ensure that the symbols are preserved
by adding an explicit `-include` reference to the linker command.

llvm-svn: 323017
2018-01-20 00:28:02 +00:00
Jessica Paquette a499c3c29d Add optional DICompileUnit to DIBuilder + make outliner debug info use it
Previously, the DIBuilder didn't expose functionality to set its compile unit
in any other way than calling createCompileUnit. This meant that the outliner,
which creates new functions, had to create a new compile unit for its debug
info.

This commit adds an optional parameter in the DIBuilder's constructor which
lets you set its CU at construction.

It also changes the MachineOutliner so that it keeps track of the DISubprograms
for each outlined sequence. If debugging information is requested, then it
uses one of the outlined sequence's DISubprograms to grab a CU. It then uses
that CU to construct the DISubprogram for the new outlined function.

The test has also been updated to reflect this change.

See https://reviews.llvm.org/D42254 for more information. Also see the e-mail
discussion on D42254 in llvm-commits for more context.

llvm-svn: 322992
2018-01-19 21:21:49 +00:00
Ulrich Weigand 5605be9e50 [SelectionDAG] Teach computeKnownBits about ATOMIC_CMP_SWAP_WITH_SUCCESS boolean return value
The second return value of ATOMIC_CMP_SWAP_WITH_SUCCESS is known to be a
boolean, and should therefore be treated by computeKnownBits just like
the second return values of SMULO / UMULO.

Differential Revision: https://reviews.llvm.org/D42067

llvm-svn: 322985
2018-01-19 20:47:14 +00:00
Daniel Neilson 1e68724d24 Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)
Summary:
 This is a resurrection of work first proposed and discussed in Aug 2015:
   http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
and initially landed (but then backed out) in Nov 2015:
   http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

 The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument
which is required to be a constant integer. It represents the alignment of the
dest (and source), and so must be the minimum of the actual alignment of the
two.

 This change is the first in a series that allows source and dest to each
have their own alignments by using the alignment attribute on their arguments.

 In this change we:
1) Remove the alignment argument.
2) Add alignment attributes to the source & dest arguments. We, temporarily,
   require that the alignments for source & dest be equal.

 For example, code which used to read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false)
will now read
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false)

 Downstream users may have to update their lit tests that check for
@llvm.memcpy/memmove/memset call/declaration patterns. The following extended sed script
may help with updating the majority of your tests, but it does not catch all possible
patterns so some manual checking and updating will be required.

s~declare void @llvm\.mem(set|cpy|move)\.p([^(]*)\((.*), i32, i1\)~declare void @llvm.mem\1.p\2(\3, i1)~g
s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* \3, i8 \4, i8 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* \3, i8 \4, i16 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* \3, i8 \4, i32 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* \3, i8 \4, i64 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* \3, i8 \4, i128 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* align \6 \3, i8 \4, i8 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* align \6 \3, i8 \4, i16 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* align \6 \3, i8 \4, i32 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* align \6 \3, i8 \4, i64 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* align \6 \3, i8 \4, i128 \5, i1 \7)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* \4, i8\5* \6, i8 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* \4, i8\5* \6, i16 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* \4, i8\5* \6, i32 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* \4, i8\5* \6, i64 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* \4, i8\5* \6, i128 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* align \8 \4, i8\5* align \8 \6, i8 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* align \8 \4, i8\5* align \8 \6, i16 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* align \8 \4, i8\5* align \8 \6, i32 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* align \8 \4, i8\5* align \8 \6, i64 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* align \8 \4, i8\5* align \8 \6, i128 \7, i1 \9)~g

 The remaining changes in the series will:
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
   source and dest alignments.
Step 3) Update Clang to use the new IRBuilder API.
Step 4) Update Polly to use the new IRBuilder API.
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
        and those that use use MemIntrinsicInst::[get|set]Alignment() to use
        getDestAlignment() and getSourceAlignment() instead.
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
        MemIntrinsicInst::[get|set]Alignment() methods.

Reviewers: pete, hfinkel, lhames, reames, bollu

Reviewed By: reames

Subscribers: niosHD, reames, jholewinski, qcolombet, jfb, sanjoy, arsenm, dschuff, dylanmckay, mehdi_amini, sdardis, nemanjai, david2050, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, llvm-commits

Differential Revision: https://reviews.llvm.org/D41675

llvm-svn: 322965
2018-01-19 17:13:12 +00:00
Francis Visoiu Mistrih 548add992e [CodeGen] Unify printing format of debug-location in both MIR and -debug
Use "debug-location" instead of "; dbg:" in MI::print.

llvm-svn: 322936
2018-01-19 11:44:42 +00:00
Matthias Braun 4a7c8e7aa2 Split MachineLICM into EarlyMachineLICM and MachineLICM; NFC
This avoids playing games with pseudo pass IDs and avoids using an
unreliable MRI::isSSA() check to determine whether register allocation
has happened.

Note that this renames:
- MachineLICMID -> EarlyMachineLICM
- PostRAMachineLICMID -> MachineLICMID
to be consistent with the EarlyTailDuplicate/TailDuplicate naming.

llvm-svn: 322927
2018-01-19 06:46:10 +00:00
Matthias Braun 3ab9fcb98e Split TailDuplicatePass into pre- and post-RA variant; NFC
Split TailDuplicatePass into EarlyTailDuplicate and TailDuplicate. This
avoids playing games with fake pass IDs and using MRI::isSSA() to
determine pre-/post-RA state.

llvm-svn: 322926
2018-01-19 06:08:17 +00:00
Serguei Katkov 22bb1c0e17 Revert [CGP] Re-enable Select in complex addressing mode
One of buildbots failed. Revert for now till fix the issue.

llvm-svn: 322923
2018-01-19 04:52:39 +00:00
Matthias Braun 5c290dc206 AArch64: Fix emergency spillslot being out of reach for large callframes
Re-commit of r322200: The testcase shouldn't hit machineverifiers
anymore with r322917 in place.

Large callframes (calls with several hundreds or thousands or
parameters) could lead to situations in which the emergency spillslot is
out of range to be addressed relative to the stack pointer.
This commit forces the use of a frame pointer in the presence of large
callframes.

This commit does several things:
- Compute max callframe size at the end of instruction selection.
- Add mirFileLoaded target callback. Use it to compute the max callframe size
  after loading a .mir file when the size wasn't specified in the file.
- Let TargetFrameLowering::hasFP() return true if there exists a
  callframe > 255 bytes.
- Always place the emergency spillslot close to FP if we have a frame
  pointer.
- Note that `useFPForScavengingIndex()` would previously return false
  when a base pointer was available leading to the emergency spillslot
  getting allocated late (that's the whole effect of this callback).
  Which made no sense to me so I took this case out: Even though the
  emergency spillslot is technically not referenced by FP in this case
  we still want it allocated early.

Differential Revision: https://reviews.llvm.org/D40876

llvm-svn: 322919
2018-01-19 03:16:36 +00:00
Francis Visoiu Mistrih eb3f76fc83 [CodeGen][NFC] Rename IsVerbose to IsStandalone in Machine*::print
Committed r322867 too soon.

Differential Revision: https://reviews.llvm.org/D42239

llvm-svn: 322868
2018-01-18 18:05:15 +00:00
Francis Visoiu Mistrih 378b5f3de6 [CodeGen] Print RegClasses on MI in verbose mode
r322086 removed the trailing information describing reg classes for each
register.

This patch adds printing reg classes next to every register when
individual operands/instructions/basic blocks are printed. In the case
of dumping MIR or printing a full function, by default don't print it.

Differential Revision: https://reviews.llvm.org/D42239

llvm-svn: 322867
2018-01-18 17:59:06 +00:00
Sanjay Patel 0e690b05d1 [TargetLowering] add punctuation for readability; NFC
llvm-svn: 322855
2018-01-18 15:25:32 +00:00
Francis Visoiu Mistrih 586444e42b [CodeGen][NFC] Refactor MachineInstr::print
* Handle more cases where the MI is not attached yet
* Add similar asserts like in MIRPrinter::print

llvm-svn: 322848
2018-01-18 14:52:14 +00:00
Sam Parker 1f8c035d6b [SelectionDAG] Convert assert to condtion
Follow-up to r322120 which can cause assertions for AArch64 because
v1f64 and v1i64 are legal types.

Differential Revision: https://reviews.llvm.org/D42097

llvm-svn: 322823
2018-01-18 09:22:24 +00:00
Craig Topper 7f0d85ec1e [DAGCombiner] Add a DAG combine to turn a splat build_vector where the splat elemnt is a bitcast from a vector type into a concat_vector
For example, a build_vector of i64 bitcasted from v2i32 can be turned into a concat_vectors of the v2i32 vectors with a bitcast to a vXi64 type

Differential Revision: https://reviews.llvm.org/D42090

llvm-svn: 322811
2018-01-18 04:17:06 +00:00
Justin Bogner a9346e050f GlobalISel: Make MachineCSE runnable in the middle of the GlobalISel
Right now, it is not possible to run MachineCSE in the middle of the
GlobalISel pipeline. Being able to run generic optimizations between the
core passes of GlobalISel was one of the goals of the new ISel framework.
This is the first attempt to do it.

The problem is that MachineCSE pass assumes all register operands have a
register class, which, in GlobalISel context, won't be true until after the
InstructionSelect pass. The reason for this behaviour is that before
replacing one virtual register with another, MachineCSE pass (and most of
the other optimization machine passes) must check if the virtual registers'
constraints have a (sufficiently large) intersection, and constrain the
resulting register appropriately if such intersection exists.

GlobalISel extends the representation of such constraints from just a
register class to a triple (low-level type, register bank, register
class).

This commit adds MachineRegisterInfo::constrainRegAttrs method that extends
MachineRegisterInfo::constrainRegClass to such a triple.

The idea is that going forward we should use:

- RegisterBankInfo::constrainGenericRegister within GlobalISel's
  InstructionSelect pass
- MachineRegisterInfo::constrainRegClass within SelectionDAG ISel
- MachineRegisterInfo::constrainRegAttrs everywhere else regardless
  the target and instruction selector it uses.

Patch by Roman Tereshin. Thanks!

llvm-svn: 322805
2018-01-18 02:06:56 +00:00
Volkan Keles 4aa73a649a Fix the failure caused by r322773
Do not run GlobalISel if `-fast-isel=0 -global-isel=false`.

llvm-svn: 322800
2018-01-18 01:10:30 +00:00
Jessica Paquette 729e68693f [MachineOutliner] Add DISubprograms to outlined functions.
Before, it wasn't possible to get backtraces inside outlined functions. This
commit adds DISubprograms to the IR functions created by the outliner which
makes this possible. Also attached a test that ensures that the produced
debug information is correct. This is useful to users that want to debug
outlined code.

llvm-svn: 322789
2018-01-18 00:00:58 +00:00
Reid Kleckner 1aa9061c5f [CodeGen] Hoist common AsmPrinter code out of X86, ARM, and AArch64
Every known PE COFF target emits /EXPORT: linker flags into a .drective
section. The AsmPrinter should handle this.

While we're at it, use global_values() and emit each export flag with
its own .ascii directive. This should make the .s file output more
readable.

llvm-svn: 322788
2018-01-17 23:55:23 +00:00
Volkan Keles a79b0620a0 Add a TargetOption to enable/disable GlobalISel
Summary:
This patch adds a new target option in order to control GlobalISel.
This will allow the users to enable/disable GlobalISel prior to the
backend by calling `TargetMachine::setGlobalISel(bool Enable)`.

No test case as there is already a test to check GlobalISel
command line options.
See: CodeGen/AArch64/GlobalISel/gisel-commandline-option.ll.

Reviewers: qcolombet, aemerson, ab, dsanders

Reviewed By: qcolombet

Subscribers: rovka, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D42137

llvm-svn: 322773
2018-01-17 22:34:21 +00:00
Benjamin Kramer 8b1986b5cb Add support for emitting libcalls for x86_fp80 -> fp128 and vice-versa
compiler_rt doesn't provide them (yet), but libgcc does. PR34076.

llvm-svn: 322772
2018-01-17 22:29:16 +00:00
Eli Friedman c60a23a6af [LegalizeDAG] Fix ATOMIC_CMP_SWAP_WITH_SUCCESS legalization.
The code wasn't zero-extending correctly, so the comparison could
spuriously fail.

Adds some AArch64 tests to cover this case.

Inspired by D41791.

Differential Revision: https://reviews.llvm.org/D41798

llvm-svn: 322767
2018-01-17 22:04:36 +00:00
Aditya Nandakumar 18b3f9d384 [GISel] Make constrainSelectedInstRegOperands() available to the legalizer. NFC
https://reviews.llvm.org/D42149

llvm-svn: 322743
2018-01-17 19:31:33 +00:00
Diana Picus 65ed364fac [ARM GlobalISel] Legalize G_FPEXT and G_FPTRUNC
Mark G_FPEXT and G_FPTRUNC as legal or libcall, depending on hardware
support, but only for conversions between float and double.

Also add the necessary boilerplate so that the LegalizerHelper can
introduce the required libcalls. This also works only for float and
double, but isn't too difficult to extend when the need arises.

llvm-svn: 322651
2018-01-17 13:34:10 +00:00
Hiroshi Inoue 8f976ba0bf [NFC] fix trivial typos in comments
"the the" -> "the"

llvm-svn: 322636
2018-01-17 12:29:38 +00:00
Pavel Labath 67530e478b Don't emit apple accelerator tables on non-darwin targets
Summary:
Currently -glldb turns on emission of apple tables on all targets, but
lldb is only really capable of consuming them on darwin. Furthermore,
making lldb consume these tables is not straight-forward because of the
differences in how the debug info is distributed on darwin vs. elf
targets.

The darwin debug model assumes that the debug info (along with
accelerator tables) will either remain in the .o files or it will be
linked into a dsym bundle by a linker that knows how to merge these
tables. In the elf world, all present linkers will simply concatenate
these accelerator tables into the shared object. Since the tables are
not self-terminating, this renders the tables unusable, as the debugger
cannot pry the individual tables apart anymore.

It might theoretically be possible to make the tables work with split
dwarf, as that is somewhat similar to the apple .o model, but
unfortunately right now the combination of -glldb and -gsplit-dwarf
produces broken object files.

Until these issues are resolved there is no point in emitting the apple
tables for these targets. At best, it wastes space; at worst, it breaks
compilation and prevents the user from getting other benefits of -glldb.

Reviewers: probinson, aprantl, dblaikie

Subscribers: emaste, dim, llvm-commits, JDevlieghere

Differential Revision: https://reviews.llvm.org/D41986

llvm-svn: 322633
2018-01-17 11:52:13 +00:00
Sean Eveson 2ae6037dd1 [MC] Fix -stack-size-section on ARM
Change symbol values in the stack_size section from being 8 bytes, to being a target dependent size.

Differential Revision: https://reviews.llvm.org/D42108

llvm-svn: 322619
2018-01-17 09:01:29 +00:00
Francis Visoiu Mistrih 54a9e7a400 [CodeGen] Skip some instructions that shouldn't affect shrink-wrapping
r320606 checked for MI.isMetaInstruction which skips all DBG_VALUEs.

This also skips IMPLICIT_DEFs and other instructions that may def / read
a register.

Differential Revision: https://reviews.llvm.org/D42119

llvm-svn: 322584
2018-01-16 18:55:26 +00:00
Petar Jovanovic 0b464e4f0e [LiveDebugValues] recognize spilled reg killed in instruction after spill
Current condition for spill instruction recognition in LiveDebugValues does
not recognize case when register is spilled and killed in next instruction.

Patch by Nikola Prica.

Differential Revision: https://reviews.llvm.org/D41226

llvm-svn: 322554
2018-01-16 14:46:05 +00:00
Francis Visoiu Mistrih 5eaddb3f68 [CodeGen] Remove special case of printing subRegIdx from MachineInstr::print
Support in MachineOperand has been added in r320209. No need to special
case this anymore.

llvm-svn: 322542
2018-01-16 10:53:14 +00:00