Commit Graph

33214 Commits

Author SHA1 Message Date
Jonas Paulsson f12b925bb1 [Stack realignment] Handling of aligned allocas.
This patch implements dynamic realignment of stack objects for targets
with a non-realigned stack pointer. Behaviour in FunctionLoweringInfo
is changed so that for a target that has StackRealignable set to
false, over-aligned static allocas are considered to be variable-sized
objects and are handled with DYNAMIC_STACKALLOC nodes.

It would be good to group aligned allocas into a single big alloca as
an optimization, but this is yet todo.

SystemZ benefits from this, due to its stack frame layout.

New tests SystemZ/alloca-03.ll for aligned allocas, and
SystemZ/alloca-04.ll for "no-realign-stack" attribute on functions.

Review and help from Ulrich Weigand and Hal Finkel.

llvm-svn: 254227
2015-11-28 11:02:32 +00:00
Rafael Espindola 5aafbac081 Pass .ll directly to llvm-link.
llvm-svn: 254214
2015-11-27 23:47:15 +00:00
Rafael Espindola 57e61231ad Pass .ll directly to llvm-link
llvm-svn: 254213
2015-11-27 23:21:45 +00:00
Diego Novillo 84f06cc835 SamplePGO - Add initial support for inliner annotations.
This adds two thresholds to the sample profiler to affect inlining
decisions: the concept of global hotness and coldness.

Functions that have accumulated more than a certain fraction of samples at
runtime, are annotated with the InlineHint attribute. Conversely,
functions that accumulate less than a certain fraction of samples, are
annotated with the Cold attribute.

This is very similar to the hints emitted by Clang when using
instrumentation profiles.

Notice that this is a very blunt instrument. A function may have
globally collected a significant fraction of samples, but that does not
necessarily mean that every callsite for that function is hot.

Ideally, we would annotate each callsite with the samples collected at
that callsite. This way, the inliner can incorporate all these weights
into its cost model.

Once the inliner offers this functionality, we can change the hints
emitted here to a more precise per-callsite annotation. For now, this is
providing some measure of speedups with our internal benchmarks. I've
observed speedups of up to 23% (though the geo mean is about 3%). I expect
these numbers to improve as the inliner gets better annotations.

llvm-svn: 254212
2015-11-27 23:14:51 +00:00
Rafael Espindola 138f895655 Modernize the test a bit
Remove out of date comment.
Pass .ll files to llvm-link.

llvm-svn: 254210
2015-11-27 23:13:17 +00:00
Artyom Skrobov b955b90509 [ARM] Generate ABI_optimization_goals build attribute, as described in the ARM ARM.
Summary:
Since this build attribute corresponds to a whole module, and
different functions in a module may differ in the optimizations
enabled for them, this attribute is emitted after all functions,
and only in the case that the optimization goals for all
functions match.

Reviewers: logan, hans

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D14934

llvm-svn: 254201
2015-11-27 15:30:51 +00:00
Oliver Stannard b25914e03f [AArch64] Add ARMv8.2-A FP16 scalar instructions
ARMv8.2-A adds 16-bit floating point versions of all existing VFP
floating-point instructions. This is an optional extension, so all of
these instructions require the FeatureFullFP16 subtarget feature.

Most of these instructions are the same as the 32- and 64-bit versions,
but with the type field (bits 23-22) set to 0b11. Previously the top bit
of the size field was always 0, so the instruction classes only provided
a 1-bit size field, which I have widened to 2 bits.

Differential Revision: http://reviews.llvm.org/D15014

llvm-svn: 254198
2015-11-27 13:04:48 +00:00
Adhemerval Zanella d93c0c4dc4 [sanitizer] [dfsan] Unify aarch64 mapping
This patch changes the DFSan instrumentation for aarch64 to instead
of using fixes application mask defined by SANITIZER_AARCH64_VMA
to read the application shadow mask value from compiler-rt. The value
is initialized based on runtime VAM detection.

Along with this patch a compiler-rt one will also be added to export
the shadow mask variable.

llvm-svn: 254196
2015-11-27 12:42:39 +00:00
Andrew Wilkins 522eb9c57d test: bail early if tool_path is None
tool_path will be None for llvm-go if Go cannot be found

llvm-svn: 254190
2015-11-27 05:07:26 +00:00
Andrew Wilkins 572fe6e95e test: check if go_executable is set
llvm-svn: 254189
2015-11-27 04:51:13 +00:00
Andrew Wilkins caa3b51ad2 Use $GO_EXECUTABLE in Go-based lit tests
Summary:
When running tests, pass the GO_EXECUTABLE CMake
cache variable to llvm-go. The "go" binary may
not be in $PATH, or may be different to the one
passed to CMake.

Reviewers: pcc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14041

llvm-svn: 254187
2015-11-27 04:44:51 +00:00
Rafael Espindola 8e8183b8bd Test both input file orders.
llvm-svn: 254186
2015-11-27 03:50:34 +00:00
Rafael Espindola 60b57863a0 Add missing file.
llvm-svn: 254185
2015-11-27 03:47:29 +00:00
Rafael Espindola 1d3465f641 Make the test a bit more interesting.
It now covers a regular function replacing an available_externally one.

llvm-svn: 254184
2015-11-27 02:07:37 +00:00
Peter Collingbourne 8359a6a83e MC: Simplify handling of temporary symbols in COFF writer.
The COFF object writer was previously adding unnecessary symbols to its
temporary data structures and cleaning them up later. This made the code
harder to understand and caused a bug (aliases classed as temporary symbols
would cause an assertion failure). A much simpler way of handling such
symbols is to ask the layout for their section-relative position when needed.

Tested with a bootstrap on Windows and by building Chrome.

Differential Revision: http://reviews.llvm.org/D14975

llvm-svn: 254183
2015-11-26 23:29:27 +00:00
Simon Pilgrim 1d881ae225 [X86][FMA] Begun adding AVX512 FMA tests
As discussed on D14909

llvm-svn: 254180
2015-11-26 20:53:28 +00:00
Charlie Turner 54336a5a4e [LoopVectorize] Use MapVector rather than DenseMap for MinBWs.
The order in which instructions are truncated in truncateToMinimalBitwidths
effects code generation. Switch to a map with a determinisic order, since the
iteration order over a DenseMap is not defined.

This code is not hot, so the difference in container performance isn't
interesting.

Many thanks to David Blaikie for making me aware of MapVector!

Fixes PR25490.

Differential Revision: http://reviews.llvm.org/D14981

llvm-svn: 254179
2015-11-26 20:39:51 +00:00
Rafael Espindola b38f7b5adc Add a few passing lto tests.
I found these while trying to get a prototype to bootstrap.

They cover things like
* Handling of non linker visible stuff (append, available_externally)
* Type merging
* Alias to dropped globals
* Dropping linkage when converting to a declaration.

These should hopefully be generally useful for anyone refactoring the
plugin.

llvm-svn: 254174
2015-11-26 19:53:12 +00:00
Rafael Espindola 8934577171 Disallow aliases to available_externally.
They are as much trouble as aliases to declarations. They are requiring
the code generator to define a symbol with the same value as another
symbol, but the second symbol is undefined.

If representing this is important for some optimization, we could add
support for available_externally aliases. They would be *required* to
point to a declaration (or available_externally definition).

llvm-svn: 254170
2015-11-26 19:22:59 +00:00
Krzysztof Parzyszek 4eb6d4d1f2 [Hexagon] Hexagon V60 HVX intrinsic defintions
Author: Ron Lieberman <ronl@codeaurora.org>
llvm-svn: 254165
2015-11-26 16:54:33 +00:00
Daniel Sanders daa4b6fbd9 [mips][ias] Range check uimm5 operands and fix several bugs this revealed.
Summary:
The bugs were:
* append, prepend, and balign were not tested
* balign takes a uimm2 not a uimm5.
* drotr32 was correctly implemented with a uimm5 but the tests expected
  '52' to be valid.
* li/la were implemented with a uimm5 instead of simm32. simm32 isn't
  completely correct either but I'll fix that when I get to simm32.

A notable omission are some of the shift instructions. Several of these
have been implemented using a single uimm6 instruction (rather than two
uimm5 instructions and a CodeGen-only uimm6 pseudo). These will be updated
in the uimm6 patch.

Reviewers: vkalintiris

Subscribers: llvm-commits, dsanders

Differential Revision: http://reviews.llvm.org/D14712

llvm-svn: 254164
2015-11-26 16:35:41 +00:00
Oliver Stannard 64c167db7a [AArch64] Add ARMv8.2-A new AT instruction variants
ARMv8.2-A adds new variants of the "at" (address translate) system
instruction, which take the PSTATE.PAN bit (added in ARMv8.1-A). These
are a required part of ARMv8.2-A, so no additional subtarget features
are required.

Differential Revision: http://reviews.llvm.org/D15018

llvm-svn: 254159
2015-11-26 15:34:44 +00:00
Martell Malone d12292480a ARM: address WOA unsigned division overflow crash
Building on r253865 the crash is not limited to signed overflows.

Disable custom handling of unsigned 32-bit and 64-bit integer divide.
Add test cases for both 32-bit and 64-bit unsigned integer overflow.

llvm-svn: 254158
2015-11-26 15:34:03 +00:00
Oliver Stannard 911ea20f07 [AArch64] Add ARMv8.2-A UAO PSTATE bit
ARMv8.2-A adds a new PSTATE bit, PSTATE.UAO, which allows the LDTR/STTR
instructions to behave the same as LDR/STR with respect to execute-only
pages at higher privilege levels. New variants of the MSR/MRS
instructions are added to allow reading and writing this bit. It is a
required part of ARMv8.2-A, so no additional subtarget features are
required.

Differential Revision: http://reviews.llvm.org/D15020

llvm-svn: 254157
2015-11-26 15:32:30 +00:00
Oliver Stannard 1a81cc9f43 [AArch64] Add ARMv8.2-A persistent memory instruction
ARMv8.2-A adds the "dc cvap" instruction, which is a system instruction
that cleans caches to the point of persistence (for systems that have
persistent memory). It is a required part of ARMv8.2-A, so no additional
subtarget features are required.

Differential Revision: http://reviews.llvm.org/D15016

llvm-svn: 254156
2015-11-26 15:28:47 +00:00
Oliver Stannard 48b43741d0 [AArch64] Add ARMv8.2-A ID_A64MMFR2_EL1 register
ARMv8.2-A adds a new ID register, ID_A64MMFR2_EL1, which behaves in the
same way as ID_A64MMFR0_EL1 and ID_A64MMFR1_EL1. It is a required part
of ARMv8.2-A, so no additional subtarget features are required.

Differential Revision: http://reviews.llvm.org/D15017

llvm-svn: 254155
2015-11-26 15:26:10 +00:00
Daniel Sanders fbb6a237ba [mips][ias] Explicitly disable IAS on tests that depend on not assembling.
Summary:
no-odd-spreg-msa.ll: This test deliberately uses an odd-numbered register
in inline assembly and expects the compiler to insert a move to an
even-numbered register.

inlineasm-operand-code.ll and inlineasm_constraint.ll:
Checks for IAS's output will be added once a matcher bug is resolved. This bug
causes the canonical output emitted by IAS to be incorrect for uimm16 constants
with the MSB set. We will still need the non-IAS checks at this point since
these tests primarily test formatting of operands.

Reviewers: vkalintiris

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D14705

llvm-svn: 254148
2015-11-26 11:23:03 +00:00
Daniel Sanders f2b5bff843 [mips][ias] Replace anchor comments with anchor instructions in tests.
Summary:
This is because IAS will delete the comments. NFC at the moment but it will
prevent a failure once IAS is the default.

Reviewers: vkalintiris

Subscribers: llvm-commits, dsanders

Differential Revision: http://reviews.llvm.org/D14704

llvm-svn: 254147
2015-11-26 10:26:18 +00:00
Benjamin Kramer fb419e71f4 [SimplifyLibCalls] Don't depend on a called function having a name, it might be an indirect call.
Fixes the crasher in PR25651 and related crashers using the same pattern.

llvm-svn: 254145
2015-11-26 09:51:17 +00:00
Vyacheslav Klochkov ed865dfcc5 X86-FMA3: Improved/enabled the memory folding optimization for scalar loads
generated for _mm_losd_s{s,d}() intrinsics and used in scalar FMAs generated 
for FMA intrinsics _mm_f{madd,msub,nmadd,nmsub}_s{s,d}().

Reviewer: David Kreitzer
Differential Revision: http://reviews.llvm.org/D14762

llvm-svn: 254140
2015-11-26 07:45:30 +00:00
Sanjoy Das bcd150362a [OperandBundles] Treat "deopt" operand bundles specially
Teach LLVM optimize to more precisely in the presence of "deopt" operand
bundles.  "deopt" operand bundles imply that the call they're attached
to is at least `readonly` (i.e. they don't imply clobber semantics), and
they don't capture their bundle operands.

llvm-svn: 254118
2015-11-26 01:16:05 +00:00
Tom Stellard 48f29f21ee AMDGPU: Add llvm.amdgcn.dispatch.ptr intrinsic
Summary:
This returns a pointer to the dispatch packet, which can be used to load
information about the kernel dispach.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D14898

llvm-svn: 254116
2015-11-26 00:43:29 +00:00
Evgeniy Stepanov 9842d61ca4 [safestack] Fix alignment of dynamic allocas.
Fixes PR25588.

llvm-svn: 254109
2015-11-25 22:52:30 +00:00
Dan Gohman a774d719a0 [WebAssembly] Fix inline asm support for i64 operands.
llvm-svn: 254106
2015-11-25 22:28:50 +00:00
Dan Gohman d9b4218831 [WebAssembly] Fold setne and seteq comparisons into selects.
llvm-svn: 254104
2015-11-25 22:13:48 +00:00
Marek Olsak 7ed6b2f414 AMDGPU/SI: select S_ABS_I32 when possible (v2)
v2: added more tests, moved the SALU->VALU conversion to a separate function

It looks like it's not possible to get subregisters in the S_ABS lowering
code, and I don't feel like guessing without testing what the correct code
would look like.

llvm-svn: 254095
2015-11-25 21:22:45 +00:00
Krzysztof Parzyszek 207c13f254 Add hexagonv55 and hexagonv60 as recognized CPUs, make v60 the default
llvm-svn: 254089
2015-11-25 20:30:59 +00:00
Matt Arsenault d179481857 AMDGPU: Add some tests for promotion of v2i64 scalar_to_vector
llvm-svn: 254087
2015-11-25 20:01:03 +00:00
Matt Arsenault 61001bbc03 AMDGPU: Make v2i64/v2f64 legal types.
They can be loaded and stored, so count them as legal. This is
mostly to fix a number of common cases for load/store merging.

llvm-svn: 254086
2015-11-25 19:58:34 +00:00
Dan Gohman fb3e0594e4 [WebAssembly] Use a physical register to describe ARGUMENT liveness.
Instead of trying to move ARGUMENT instructions back up to the top after
they've been scheduled or sunk down, use a fake physical register to
create a liveness constraint that prevents ARGUMENT instructions from
moving down in the first place. This is still not entirely ideal, however
it is more robust than letting them move and moving them back.

llvm-svn: 254084
2015-11-25 19:36:19 +00:00
Dan Gohman 1270b0a91d [WebAssembly] Make several tests more strict.
llvm-svn: 254077
2015-11-25 17:33:15 +00:00
Dan Gohman 81719f8555 [WebAssembly] Support for register stackifying with load and store instructions.
llvm-svn: 254076
2015-11-25 16:55:01 +00:00
Dan Gohman 2c8fe6a428 [WebAssembly] Codegen support for ISD::ExternalSymbol
llvm-svn: 254075
2015-11-25 16:44:29 +00:00
Hal Finkel 005f840959 [PowerPC] Don't generate mfocrf on the e500mc
The e500mc does not actually support the mfocrf instruction; update the
processor definitions to reflect that fact.

Patch by Tom Rix (with some test-case cleanup by me).

llvm-svn: 254064
2015-11-25 10:14:31 +00:00
Eric Christopher f83a2c2db2 Accept any stack offset, including none, here.
llvm-svn: 254062
2015-11-25 09:21:36 +00:00
Eric Christopher 4675c439aa Fix some places where we were assuming that memory type had been legalized
to a simple type when lowering a truncating store of a vector type. In this
case for an EVT we'll return Expand as we should in all of the cases anyhow.

The testcase triggered at the one in VectorLegalizer::LegalizeOp, inspection
found the rest.

llvm-svn: 254061
2015-11-25 09:11:53 +00:00
Simon Pilgrim c85c49c665 [X86][AVX] Regenerate Splat OptSize tests
Tidied up triple and regenerate tests using update_llc_test_checks.py

llvm-svn: 254060
2015-11-25 09:06:17 +00:00
Elena Demikhovsky f07df9fcac AVX-512: Fixed a bug in VPERMT2* intrinsic.
It was wrong order of operands (from intrinsic to DAG node).
I added more strict type specification for instruction selection.

Differential Revision: http://reviews.llvm.org/D14942

llvm-svn: 254059
2015-11-25 08:17:56 +00:00
Peter Collingbourne 463ff6d823 AsmParser: Make the code for parsing unnamed aliases more closely resemble that for unnamed globals.
This fixes parsing of forward references to unnamed aliases.

While here, remove an unnecessary isa check.

llvm-svn: 254054
2015-11-25 02:54:07 +00:00
Sanjoy Das 7629346193 [InstCombine] Don't drop operand bundles
Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14857

llvm-svn: 254046
2015-11-25 00:42:19 +00:00
Hans Wennborg e412b71f95 Revert r253528: "[X86] Enable shrink-wrapping by default."
This caused PR25607 and also caused Chromium to crash on start-up.

(Also had to update test/CodeGen/X86/avx-splat.ll, which was committed
after shrink wrapping was enabled.)

llvm-svn: 254044
2015-11-25 00:05:13 +00:00
Rong Xu 55fa418a90 Revert r254021
llvm-svn: 254042
2015-11-24 23:57:51 +00:00
Rong Xu 25c106b347 [PGO] Revert revision r254021,r254028,r254035
Revert the above revision due to multiple issues.

llvm-svn: 254040
2015-11-24 23:49:08 +00:00
Teresa Johnson 3930361969 [ThinLTO] Add option to limit importing based on instruction count
Add a simple initial heuristic to control importing based on the number
of instructions recorded in the function's summary. Add option to
control the limit, and test using option.

llvm-svn: 254036
2015-11-24 22:55:46 +00:00
Rong Xu 88cb57aba9 [PGO] Relax test cases in PGO instrumentation
Fix buildbot failure for clang-x86_64-linux-selfhost-modules.
http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules/builds/8866
The failing test cases are newly added from r254021. It seems the IR has a
different order in this platform. In this patch, I temporarily relax the test
case to make the build green. I'll have a complete fix (more robust way to test)
soon.

llvm-svn: 254035
2015-11-24 22:50:34 +00:00
Diego Novillo 0b6985a3c6 SamplePGO - Add test for hot/cold inlined functions.
When the original binary is executed and sampled, the resulting profile
contains information on the original inline stack. We currently follow
the original inline plan if we notice that the inlined callsite has more
than 0 samples to it.

A better way is to determine whether the callsite is actually worth
inlining. If the callsite accumulates a small fraction of the samples
spent in the parent function, then we don't want to bother inlining it
(as it means that the callsite is actually cold).

This patch introduces a threshold expressed in percentage of samples
in relation to the parent function.  If the callsite uses less than N%
of the total samples used by its parent, the original inline decision is
not re-applied.

I've set the threshold to the very arbitrary value of 5%. I'm yet to do
any actual experiments to see what's a good value. I wanted to separate
the basic mechanism from the tuning.

llvm-svn: 254034
2015-11-24 22:38:37 +00:00
Simon Pilgrim c1225c28e1 [X86][SSE] Regenerate PMUL tests
Tidied up triple and regenerate tests using update_llc_test_checks.py

llvm-svn: 254029
2015-11-24 22:09:31 +00:00
Evgeniy Stepanov b05d380451 [msan] Relax origin-alignment test.
Change origin-alignment test to test only the alignment of the origin
store, and not the exact instruction sequence used to compute the
address. This makes the test less fragile and, in particular, lets it
pass both with the old and new MSan ABIs.

llvm-svn: 254027
2015-11-24 21:44:16 +00:00
Rong Xu 1b665ca707 [PGO] MST based PGO instrumentation infrastructure
This patch implements a minimum spanning tree (MST) based instrumentation for
PGO. The use of MST guarantees minimum number of CFG edges getting
instrumented. An addition optimization is to instrument the less executed
edges to further reduce the instrumentation overhead. The patch contains both the
instrumentation and the use of the profile to set the branch weights.

Differential Revision: http://reviews.llvm.org/D12781

llvm-svn: 254021
2015-11-24 21:31:25 +00:00
Simon Pilgrim 1b4fecb098 [X86][FMA] Optimize FNEG(FMA) Patterns
X86 needs to use its own FMA opcodes, preventing the standard FNEG(FMA) pattern table recognition method used by other platforms. This patch adds support for lowering FNEG(FMA(X,Y,Z)) into a single suitably negated FMA instruction.

Fix for PR24364

Differential Revision: http://reviews.llvm.org/D14906

llvm-svn: 254016
2015-11-24 20:31:46 +00:00
Teresa Johnson 130de7af7f [ThinLTO] Enable iterative importing in FunctionImport pass
Analyze imported function bodies and add any new external calls to
the worklist for importing. Currently no controls on the importing
so this will end up importing everything possible in the call tree
below the importing module. Basic profitability checks coming next.

Update test to check for iteratively inlined functions.

llvm-svn: 254011
2015-11-24 19:55:04 +00:00
Cong Hou db6220f84d [X86] Fix several issues related to X86's psadbw instruction.
This patch fixes the following issues:

1. Fix the return type of X86psadbw: it should not be the same type of inputs.
   For vNi8 inputs the output should be vMi64, where M = N/8.
2. Fix the return type of int_x86_avx512_psad_bw_512 accordingly.
3. Fix the definiton of PSADBW, VPSADBW, and VPSADBWY accordingly.
4. Adjust the return type when building a DAG node of X86ISD::PSADBW type.
5. Update related tests.


Differential revision: http://reviews.llvm.org/D14897

llvm-svn: 254010
2015-11-24 19:51:26 +00:00
Teresa Johnson b098f0c133 [ThinLTO] Handle previously imported and promoted locals in module linker
The new function import pass exposed an issue when we import references
to local values on multiple importing passes. They are renamed on each
import pass, and we need to ensure that the already promoted and renamed
references existing in the dest module are correctly identified and
updated so that they aren't spuriously renamed again (due to a perceived
conflict with the newly linked reference).

llvm-svn: 254009
2015-11-24 19:46:58 +00:00
Sanjay Patel 968e91aea0 [InstCombine] fix propagation of fast-math-flags
Noticed while working on D4583:
http://reviews.llvm.org/D4583

llvm-svn: 253997
2015-11-24 17:51:20 +00:00
Rafael Espindola 383d1f6aa6 Make this test a bit more strict.
It now tests with files in both orders.

llvm-svn: 253993
2015-11-24 16:43:53 +00:00
Teresa Johnson 17626654fd [ThinLTO] Fix FunctionImport alias checking and test
Skip imports for weak_any aliases as well. Fix the test to check
non-import of weak aliases and functions, and import of normal alias.

llvm-svn: 253991
2015-11-24 16:10:43 +00:00
Sanjay Patel a0d354541d [x86] remove duplicate movq instruction defs (PR25554)
We had duplicated definitions for the same hardware '[v]movq' instructions. For example with SSE:

  def MOVZQI2PQIrr : RS2I<0x6E, MRMSrcReg, (outs VR128:$dst), (ins GR64:$src),
                     "mov{d|q}\t{$src, $dst|$dst, $src}", // X86-64 only
                     [(set VR128:$dst, (v2i64 (X86vzmovl (v2i64 (scalar_to_vector GR64:$src)))))],
                     IIC_SSE_MOVDQ>;

  def MOV64toPQIrr : RS2I<0x6E, MRMSrcReg, (outs VR128:$dst), (ins GR64:$src),
                     "mov{d|q}\t{$src, $dst|$dst, $src}",
                     [(set VR128:$dst, (v2i64 (scalar_to_vector GR64:$src)))],
                     IIC_SSE_MOVDQ>, Sched<[WriteMove]>;

As shown in the test case and PR25554:
https://llvm.org/bugs/show_bug.cgi?id=25554

This causes us to miss reusing an operand because later passes don't know these 'movq' are the same instruction.
This patch deletes one pair of these defs.
Sadly, this won't fix the original test case in the bug report. Something else is still broken.

Differential Revision: http://reviews.llvm.org/D14941

llvm-svn: 253988
2015-11-24 15:44:35 +00:00
Rafael Espindola 23117e5a7b Add an already passing test.
This tests that a declaration can resolve to an alias.

I broke this locally while prototyping a change and it looks like a nice
test to have.

llvm-svn: 253984
2015-11-24 14:15:50 +00:00
Krzysztof Parzyszek d4b566d50b Add new vector types for 512-, 1024- and 2048-bit vectors
Those types are needed to implement instructions for Hexagon Vector
Extensions (HVX): 16x32, 16x64, 32x16, 32x32, 32x64, 64x8, 64x16,
64x32, 128x8, 128x16, 256x8, 512x1, and 1024x1.

llvm-svn: 253978
2015-11-24 13:07:35 +00:00
Matt Arsenault ff05da806c AMDGPU: Split LDS vector loads
If properly aligned this could allow using ds_read_b64.

llvm-svn: 253975
2015-11-24 12:18:54 +00:00
Matt Arsenault 4d801cd357 AMDGPU: Split x8 and x16 vector loads instead of scalarize
The one regression in the builtin tests is in the read2 test which now
(again) has many extra copies, but this should be solved once the pass
is replaced with a DAG combine.

llvm-svn: 253974
2015-11-24 12:05:03 +00:00
Cong Hou 1938f2eb98 Let SelectionDAG start to use probability-based interface to add successors.
The patch in http://reviews.llvm.org/D13745 is broken into four parts:

1. New interfaces without functional changes.
2. Use new interfaces in SelectionDAG, while in other passes treat probabilities
as weights.
3. Use new interfaces in all other passes.
4. Remove old interfaces.

This the second patch above. In this patch SelectionDAG starts to use
probability-based interfaces in MBB to add successors but other MC passes are
still using weight-based interfaces. Therefore, we need to maintain correct
weight list in MBB even when probability-based interfaces are used. This is
done by updating weight list in probability-based interfaces by treating the
numerator of probabilities as weights. This change affects many test cases
that check successor weight values. I will update those test cases once this
patch looks good to you.


Differential revision: http://reviews.llvm.org/D14361

llvm-svn: 253965
2015-11-24 08:51:23 +00:00
Mehdi Amini 42418aba58 Add a FunctionImporter helper to perform summary-based cross-module function importing
Summary:
This is a helper to perform cross-module import for ThinLTO. Right now
it is importing naively every possible called functions.

Reviewers: tejohnson

Subscribers: dexonsmith, llvm-commits

Differential Revision: http://reviews.llvm.org/D14914

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 253954
2015-11-24 06:07:49 +00:00
Cong Hou bed60d35ed [X86][SSE] Detect AVG pattern during instruction combine for SSE2/AVX2/AVX512BW.
This patch detects the AVG pattern in vectorized code, which is simply
c = (a + b + 1) / 2, where a, b, and c have the same type which are vectors of
either unsigned i8 or unsigned i16. In the IR, i8/i16 will be promoted to
i32 before any arithmetic operations. The following IR shows such an example:

%1 = zext <N x i8> %a to <N x i32>
%2 = zext <N x i8> %b to <N x i32>
%3 = add nuw nsw <N x i32> %1, <i32 1 x N>
%4 = add nuw nsw <N x i32> %3, %2
%5 = lshr <N x i32> %N, <i32 1 x N>
%6 = trunc <N x i32> %5 to <N x i8>

and with this patch it will be converted to a X86ISD::AVG instruction.

The pattern recognition is done when combining instructions just before type
legalization during instruction selection. We do it here because after type
legalization, it is much more difficult to do pattern recognition based
on many instructions that are doing type conversions. Therefore, for
target-specific instructions (like X86ISD::AVG), we need to take care of type
legalization by ourselves. However, as X86ISD::AVG behaves similarly to
ISD::ADD, I am wondering if there is a way to legalize operands and result
types of X86ISD::AVG together with ISD::ADD. It seems that the current design
doesn't support this idea.

Tests are added for SSE2, AVX2, and AVX512BW and both i8 and i16 types of
variant vector sizes.


Differential revision: http://reviews.llvm.org/D14761

llvm-svn: 253952
2015-11-24 05:44:19 +00:00
Sanjay Patel 8ca4a5b9e5 minimize test case but still show the bug
llvm-svn: 253940
2015-11-24 00:11:48 +00:00
Sanjay Patel 16fcf25eb9 added comment (using freshly updated update_llc_test_checks.py)
llvm-svn: 253935
2015-11-23 23:22:05 +00:00
Sanjay Patel d6e0cb01b1 [x86] add test to show suboptimal codegen (PR25554)
llvm-svn: 253934
2015-11-23 23:18:20 +00:00
Krzysztof Parzyszek d5d083ccd4 Revert r253923.
Per Eric's request.

llvm-svn: 253928
2015-11-23 22:19:57 +00:00
Andy Ayers 9f7501896e findDeadCallerSavedReg needs to pay attention to calling convention
Caller saved regs differ between SysV and Win64. Use the tail call available set to scavenge from.

Refactor register info to create new helper to get at tail call GPRs. Added a new test case for windows. Fixed up a number of X64 tests since now RCX is preferred over RDX on SysV.

Differential Revision: http://reviews.llvm.org/D14878

llvm-svn: 253927
2015-11-23 22:17:44 +00:00
Dan Gohman 2f16f25391 [WebAssembly] Don't special-case call operand order.
With the '=' suffix now indicating which operands are output operands, it's
no longer as important to distinguish between a call's inputs and its outputs
using operand ordering, so we can go back to printing them in the normal order.

llvm-svn: 253925
2015-11-23 22:04:06 +00:00
Krzysztof Parzyszek f358bfff17 Add new vector types for 512-, 1024- and 2048-bit vectors
Those types are needed to implement instructions for Hexagon Vector
Extensions (HVX): 16x32, 16x64, 32x16, 32x32, 32x64, 64x8, 64x16,
64x32, 128x8, 128x16, 256x8, 512x1, and 1024x1.

llvm-svn: 253923
2015-11-23 22:00:17 +00:00
Dan Gohman 700515fa92 [WebAssembly] Suffix output operands with '='.
This distinguishes input operands from output operands. This is something of
a syntactic experiment to see whether the mild amount of clutter this adds is
outweighed by the extra information it conveys to the reader.

llvm-svn: 253922
2015-11-23 21:55:57 +00:00
Sanjoy Das d5658b0896 [RuntimeDyld] Don't allocate unnecessary stub buffer space
Summary:
For relocation types that are known to not require stub functions, there
is no need to allocate extra space for the stub functions.

Reviewers: lhames, reames, maksfb

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14676

llvm-svn: 253920
2015-11-23 21:47:51 +00:00
James Y Knight 7c905063c5 Make utils/update_llc_test_checks.py note that the assertions are
autogenerated.

Also update existing test cases which appear to be generated by it and
weren't modified (other than addition of the header) by rerunning it.

llvm-svn: 253917
2015-11-23 21:33:58 +00:00
Dan Gohman 7054ac1b8b [WebAssembly] Model the return value of store instructions in wasm.
llvm-svn: 253916
2015-11-23 21:16:35 +00:00
Xinliang David Li 6f7c19a494 [PGO] Add --text option for llvm-profdata show|merge commands
The new option is similar to the SampleProfile dump option.

- dump raw/indexed format into text profile format
- merge the profile and output into text profile format.

Note that Value Profiling data text format is not yet designed. 
That functionality will be added later.

Differential Revision: http://reviews.llvm.org/D14894

llvm-svn: 253913
2015-11-23 20:47:38 +00:00
Diego Novillo 243ea6a7d6 SamplePGO - Add coverage tracking for samples.
The existing coverage tracker counts the number of records that were used
from the input profile. An alternative view of coverage is to check how
many available samples were applied.

This way, if the profile contains several records with few samples, it
doesn't really matter much that they were not applied. The more
interesting records to apply are the ones that contribute many samples.

llvm-svn: 253912
2015-11-23 20:12:21 +00:00
Andrew Kaylor 0615a0e65d [WinEH] Fix a case where GVN could incorrectly PRE a load into an EH pad.
Differential Revision: http://reviews.llvm.org/D14842

llvm-svn: 253908
2015-11-23 19:51:41 +00:00
Dan Gohman aa0a4bd05b [WebAssembly] Don't use set_local instructions explicitly.
The current approach to using get_local and set_local is to use them
implicitly, as register uses and defs. Introduce new copy instructions
which are themselves no-ops except for the get_local and set_local
that they imply, so that we use get_local and set_local consistently.

llvm-svn: 253905
2015-11-23 19:30:43 +00:00
Andrew Kaylor d0430e8580 [WinEH] Fix problem where CodeGenPrepare incorrectly sinks a bitcast into an EH pad.
Differential Revision: http://reviews.llvm.org/D14842

llvm-svn: 253902
2015-11-23 19:16:15 +00:00
Dan Gohman f6857223c9 [WebAssembly] Always print loop end labels
WebAssembly is currently using labels to end scopes, so for example a
loop scope looks like this:

BB0_0:
  loop BB0_1
  ...
BB0_1:

with BB0_0 being the label of the first block not in the loop. This
requires that the label be printed even when it's only reachable via
fallthrough. To arrange this, insert a no-op LOOP_END instruction in
such cases at the end of the loop.

llvm-svn: 253901
2015-11-23 19:12:37 +00:00
Dan Gohman 53828fd777 [WebAssembly] Emit .param, .result, and .local through MC.
This eliminates one of the main remaining uses of EmitRawText.

llvm-svn: 253878
2015-11-23 16:50:18 +00:00
Dan Gohman 3280793234 [WebAssembly] Use dominator information to improve BLOCK placement
Always starting blocks at the top of their containing loops works, but creates
unnecessarily deep nesting because it makes all blocks in a loop overlap.
Refine the BLOCK placement algorithm to start blocks at nearest common
dominating points instead, which significantly shrinks them and reduces
overlapping.

llvm-svn: 253876
2015-11-23 16:19:56 +00:00
Daniel Sanders 2b561336d9 [mips] .ent and .end should also set the type and size of the symbol respectively.
Reviewers: vkalintiris

Subscribers: llvm-commits, seanbruno, emaste, vkalintiris, dsanders

Differential Revision: http://reviews.llvm.org/D14221

llvm-svn: 253875
2015-11-23 16:08:03 +00:00
Martell Malone a6b867eb0d ARM: address WoA division overflow crash
Disable custom handling of signed 32-bit and 64-bit integer divide.
Add test cases for both 32-bit and 64-bit integer overflow crashes.

llvm-svn: 253865
2015-11-23 13:11:39 +00:00
Simon Pilgrim 806c42a747 [X86][FMA] Regenerate tests.
Fixes some broken checks.

llvm-svn: 253830
2015-11-22 19:05:53 +00:00
Simon Pilgrim a8e9c8d3da [X86][AVX] Added load splat tests.
Placeholder for upcoming patch for PR23022.

llvm-svn: 253824
2015-11-22 16:52:16 +00:00
Elena Demikhovsky 0fd11526e2 AVX-512: Optimized INSERT_SUBVECTOR for i1 vector types
ISERT_SUBVECTOR for i1 vectors may be done with shifts, when we insert into the lower part, or into the upper part, on into all-zero vector.
CONCAT_VECTORS uses ISERT_SUBVECTOR.

Differential Revision: http://reviews.llvm.org/D14815

llvm-svn: 253819
2015-11-22 13:57:38 +00:00
Rafael Espindola d1beb07d39 Have a single way for creating unique value names.
We had two code paths. One would create names like "foo.1" and the other
names like "foo1".

For globals it is important to use "foo.1" to help C++ name demangling.
For locals there is no strong reason to go one way or the other so I
kept the most common mangling (foo1).

llvm-svn: 253804
2015-11-22 00:16:24 +00:00
Teresa Johnson 6290dbc0f7 [ThinLTO] Handle bitcode without function summary sections gracefully
Summary:
Several fixes to the handling of bitcode files without function summary
sections so that they are skipped during ThinLTO processing in llvm-lto
and the gold plugin when appropriate instead of aborting.

1 Don't assert when trying to add a FunctionInfo that doesn't have
  a summary attached.
2 Skip FunctionInfo structures that don't have attached function summary
  sections when trying to create the combined function summary.
3 In both llvm-lto and gold-plugin, check whether a bitcode file has
  a function summary section before trying to parse the index, and skip
  the bitcode file if it does not.
4 Fix hasFunctionSummaryInMemBuffer in BitcodeReader, which had a bug
  where we returned to early while looking for the summary section.

Also added llvm-lto and gold-plugin based tests for cases where we
don't have function summaries in the bitcode file. I verified that
either the first couple fixes described above are enough to avoid the
crashes, or fixes 1,3,4. But have combined them all here for added
robustness.

Reviewers: joker.eph

Subscribers: llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D14903

llvm-svn: 253796
2015-11-21 21:55:48 +00:00