Commit Graph

104110 Commits

Author SHA1 Message Date
Teresa Johnson 538b8d25f0 Add zero-length check to memcpy/memset load store loop expansion
Summary:
I was testing using this expansion logic in other cases besides
NVPTX, and found some runtime failures due to the lack of a check
for a zero length memcpy/memset before the loop. There is already
such a check in the memmove expansion code though.

Reviewers: hfinkel

Subscribers: jholewinski, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D34707

llvm-svn: 306541
2017-06-28 13:07:37 +00:00
Nikolai Bozhenov 6710ba07c7 Revert r306528
llvm-svn: 306536
2017-06-28 12:15:13 +00:00
Igor Breger d5b59cf914 [GlobalISel][X86] Support bitwise operations : G_AND, G_OR, G_XOR
Summary: Support G_AND, G_OR, G_XOR for i8/i16/i32/i64. Selection done via TableGen'erated code.

Reviewers: zvi, guyblank, aymanmus, m_zuckerman

Reviewed By: aymanmus

Subscribers: rovka, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D34605

llvm-svn: 306533
2017-06-28 11:39:04 +00:00
Michael Zuckerman f66840020c Reverting commit 306414 on behalf of @gadi.haber
llvm-svn: 306532
2017-06-28 11:23:31 +00:00
Petar Jovanovic 7b3a38ec30 [X86] Correct dwarf unwind information in function epilogue
CFI instructions that set appropriate cfa offset and cfa register are now
inserted in emitEpilogue() in X86FrameLowering.

Majority of the changes in this patch:

1. Ensure that CFI instructions do not affect code generation.
2. Enable maintaining correct information about cfa offset and cfa register
in a function when basic blocks are reordered, merged, split, duplicated.

These changes are target independent and described below.

Changed CFI instructions so that they:

1. are duplicable
2. are not counted as instructions when tail duplicating or tail merging
3. can be compared as equal

Add information to each MachineBasicBlock about cfa offset and cfa register
that are valid at its entry and exit (incoming and outgoing CFI info). Add
support for updating this information when basic blocks are merged, split,
duplicated, created. Add a verification pass (CFIInfoVerifier) that checks
that outgoing cfa offset and register of predecessor blocks match incoming
values of their successors.

Incoming and outgoing CFI information is used by a late pass
(CFIInstrInserter) that corrects CFA calculation rule for a basic block if
needed. That means that additional CFI instructions get inserted at basic
block beginning to correct the rule for calculating CFA. Having CFI
instructions in function epilogue can cause incorrect CFA calculation rule
for some basic blocks. This can happen if, due to basic block reordering,
or the existence of multiple epilogue blocks, some of the blocks have wrong
cfa offset and register values set by the epilogue block above them.

Patch by Violeta Vukobrat.

Differential Revision: https://reviews.llvm.org/D18046

llvm-svn: 306529
2017-06-28 10:21:17 +00:00
Nikolai Bozhenov 77b5536e4e [ValueTracking] Enabling existing ValueTracking patch by default.
The original patch was an improvement to IR ValueTracking on non-negative
integers. It has been checked in to trunk (D18777, r284022). But was disabled by
default due to performance regressions.
Perf impact has improved. The patch would be enabled by default.

Reviewers: reames

Differential Revision: https://reviews.llvm.org/D34101

Patch by: Olga Chupina <olga.chupina@intel.com>

llvm-svn: 306528
2017-06-28 10:08:08 +00:00
Nikolai Bozhenov b01e6b5a52 [InstCombine] Canonicalize clamp of float types to minmax in fast mode.
Summary:
This commit allows matchSelectPattern to recognize clamp of float
arguments in the presence of FMF the same way as already done for
integers.

This case is a little different though. With integers, given the
min/max pattern is recognized, DAGBuilder starts selecting MIN/MAX
"automatically". That is not the case for float, because for them only
full FMINNAN/FMINNUM/FMAXNAN/FMAXNUM ISD nodes exist and they do care
about NaNs. On the other hand, some backends (e.g. X86) have only
FMIN/FMAX nodes that do not care about NaNS and the former NAN/NUM
nodes are illegal thus selection is not happening. So I decided to do
such kind of transformation in IR (InstCombiner) instead of
complicating the logic in the backend.

Reviewers: spatel, jmolloy, majnemer, efriedma, craig.topper

Reviewed By: efriedma

Subscribers: hiraditya, javed.absar, n.bozhenov, llvm-commits

Patch by Andrei Elovikov <andrei.elovikov@intel.com>

Differential Revision: https://reviews.llvm.org/D33186

llvm-svn: 306525
2017-06-28 09:26:20 +00:00
George Rimar 1af3cb2912 Recommit "[ELF] - Add ability for DWARFContextInMemory to exit early when any error happen."
With fix in include folder character case:
#include "llvm/Codegen/AsmPrinter.h" -> #include "llvm/CodeGen/AsmPrinter.h"

Original commit message:

Change introduces error reporting policy for DWARFContextInMemory.
New callback provided by client is able to handle error on it's
side and return Halt or Continue.

That allows to either keep current behavior when parser prints all errors
but continues parsing object or implement something very different, like
stop parsing on a first error and report an error in a client style.

Differential revision: https://reviews.llvm.org/D34328

llvm-svn: 306517
2017-06-28 08:21:19 +00:00
Kristof Beyls eecb353d0e [ARM] Make -mcpu=generic schedule for an in-order core (Cortex-A8).
The benchmarking summarized in
http://lists.llvm.org/pipermail/llvm-dev/2017-May/113525.html showed
this is beneficial for a wide range of cores.

As is to be expected, quite a few small adaptations are needed to the
regressions tests, as the difference in scheduling results in:
- Quite a few small instruction schedule differences.
- A few changes in register allocation decisions caused by different
 instruction schedules.
- A few changes in IfConversion decisions, due to a difference in
 instruction schedule and/or the estimated cost of a branch mispredict.

llvm-svn: 306514
2017-06-28 07:07:03 +00:00
George Rimar 7a82cffd68 Revert r306512 "[ELF] - Add ability for DWARFContextInMemory to exit early when any error happen."
It broke BB:

[13/106] 13 0.022 Generating VCSRevision.h
[25/106] 24 1.209 Building CXX object unittests/DebugInfo/DWARF/CMakeFiles/DebugInfoDWARFTests.dir/DWARFDebugInfoTest.cpp.o
FAILED: unittests/DebugInfo/DWARF/CMakeFiles/DebugInfoDWARFTests.dir/DWARFDebugInfoTest.cpp.o 
/home/bb/bin/g++  -DGTEST_HAS_RTTI=0 -DLLVM_BUILD_GLOBAL_ISEL -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Iunittests/DebugInfo/DWARF -I../llvm-project/llvm/unittests/DebugInfo/DWARF -Iinclude -I../llvm-project/llvm/include -I../llvm-project/llvm/utils/unittest/googletest/include -I../llvm-project/llvm/utils/unittest/googlemock/include -fPIC -fvisibility-inlines-hidden -m32 -std=c++11 -Wall -W -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wno-maybe-uninitialized -Wdelete-non-virtual-dtor -Wno-comment -ffunction-sections -fdata-sections -O3    -UNDEBUG  -Wno-variadic-macros -fno-exceptions -fno-rtti -MD -MT unittests/DebugInfo/DWARF/CMakeFiles/DebugInfoDWARFTests.dir/DWARFDebugInfoTest.cpp.o -MF unittests/DebugInfo/DWARF/CMakeFiles/DebugInfoDWARFTests.dir/DWARFDebugInfoTest.cpp.o.d -o unittests/DebugInfo/DWARF/CMakeFiles/DebugInfoDWARFTests.dir/DWARFDebugInfoTest.cpp.o -c ../llvm-project/llvm/unittests/DebugInfo/DWARF/DWARFDebugInfoTest.cpp
../llvm-project/llvm/unittests/DebugInfo/DWARF/DWARFDebugInfoTest.cpp:18:37: fatal error: llvm/Codegen/AsmPrinter.h: No such file or directory
 #include "llvm/Codegen/AsmPrinter.h"
                                     ^
compilation terminated.

llvm-svn: 306513
2017-06-28 07:06:17 +00:00
George Rimar 397a70425b [ELF] - Add ability for DWARFContextInMemory to exit early when any error happen.
Change introduces error reporting policy for DWARFContextInMemory.
New callback provided by client is able to handle error on it's
side and return Halt or Continue.

That allows to either keep current behavior when parser prints all errors
but continues parsing object or implement something very different, like
stop parsing on a first error and report an error in a client style.

Differential revision: https://reviews.llvm.org/D34328

llvm-svn: 306512
2017-06-28 06:57:20 +00:00
Max Kazantsev 6c466a376e [IRCE][NFC] Better get SCEV for 1 in calculateSubRanges
A slightly more efficient way to get constant, we avoid resolving in getSCEV and excessive
invocations, and we don't create a ConstantInt if 'true' branch is taken.

Differential Revision: https://reviews.llvm.org/D34672

llvm-svn: 306503
2017-06-28 04:57:45 +00:00
Nirav Dave c4ce2293b0 Revert "[DAG] Fold FrameIndex offset into BaseIndexOffset analysis. NFCI."
This reverts commit r306498 which appears to cause a compilrt-rt test failures

llvm-svn: 306501
2017-06-28 03:20:04 +00:00
Stanislav Mekhanoshin d445455643 [AMDGPU] Add pattern for v_alignbit_b32 with immediate
If immediate in shift is less than 32 we can use alignbit too.

Differential Revision: https://reviews.llvm.org/D34729

llvm-svn: 306500
2017-06-28 02:52:39 +00:00
Stanislav Mekhanoshin eb40733bf0 Allow to truncate left shift with non-constant shift amount
That is pretty common for clang to produce code like
(shl %x, (and %amt, 31)). In this situation we can still perform
trunc (shl) into shl (trunc) conversion given the known value
range of shift amount.

Differential Revision: https://reviews.llvm.org/D34723

llvm-svn: 306499
2017-06-28 02:37:11 +00:00
Nirav Dave 8ef03802f1 [DAG] Fold FrameIndex offset into BaseIndexOffset analysis. NFCI.
Pull FrameIndex comparision reasoning from DAGCombiner::isAlias to
general BaseIndexOffset.

llvm-svn: 306498
2017-06-28 02:09:50 +00:00
Kyle Butt f73c8a06a9 Inlining: Don't re-map simplified cloned instructions.
When simplifying an instruction that has been re-mapped, it should never
simplify to an instruction in the original function. In the edge case
where we are inlining a function into itself, the existing code led to
incorrect behavior. Replace the incorrect code with an assert verifying
that we never expect simplification to produce an instruction in the old
function, unless the functions are the same.

Differential Revision: https://reviews.llvm.org/D33850

llvm-svn: 306495
2017-06-28 01:41:25 +00:00
Peter Collingbourne 2d67e3789a Add missing library dependency.
llvm-svn: 306491
2017-06-28 00:05:27 +00:00
Mandeep Singh Grang 0c72172e32 [COFF, ARM64] Add support for Windows ARM64 COFF format
Summary:
This is the llvm part of the initial implementation to support Windows ARM64 COFF format.
I will gradually add more functionality in subsequent patches.

Reviewers: ruiu, rnk, t.p.northover, compnerd

Reviewed By: ruiu, compnerd

Subscribers: aemerson, mgorny, javed.absar, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D34705

llvm-svn: 306490
2017-06-27 23:58:19 +00:00
Peter Collingbourne 99b98c21f2 Object: Teach irsymtab::read() to try to use the irsymtab that we wrote to disk.
Fixes PR27551.

Differential Revision: https://reviews.llvm.org/D33974

llvm-svn: 306488
2017-06-27 23:50:24 +00:00
Peter Collingbourne 92648c25a4 Bitcode: Write the irsymtab to disk.
Differential Revision: https://reviews.llvm.org/D33973

llvm-svn: 306487
2017-06-27 23:50:11 +00:00
Peter Collingbourne 53ed867da8 Object: Add version and producer fields to the irsymtab header. NFCI.
These will be necessary in order to handle upgrades from old bitcode
files.

Differential Revision: https://reviews.llvm.org/D33972

llvm-svn: 306486
2017-06-27 23:49:58 +00:00
Sanjay Patel 4b23fa0abf [CGP] add specialization for memcmp expansion with only one basic block
llvm-svn: 306485
2017-06-27 23:15:01 +00:00
Easwaran Raman c5fa6358ba [NewPM/Inliner] Reduce threshold for cold callsites in the non-PGO case
Differential Revision: https://reviews.llvm.org/D34312

llvm-svn: 306484
2017-06-27 23:11:18 +00:00
Tim Northover c990236ff9 GlobalISel: add some more sanity-checking to MachineInstrBuilder. NFC.
llvm-svn: 306481
2017-06-27 22:45:35 +00:00
Florian Hahn 2665febb54 [AArch64] Inline callee if its target-features are a subset of the caller
Summary:
Similar to X86, it should be safe to inline callees if their target-features
are a subset of the caller. This change matches GCC's inlining behavior
with respect to attributes [1].

[1] https://gcc.gnu.org/onlinedocs/gcc/AArch64-Function-Attributes.html#AArch64-Function-Attributes

Reviewers: kristof.beyls, javed.absar, rengolin, t.p.northover

Reviewed By: t.p.northover

Subscribers: aemerson, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D34698

llvm-svn: 306478
2017-06-27 22:27:32 +00:00
Geoff Berry 2573a19fe6 [EarlyCSE][MemorySSA] Enable MemorySSA in function-simplification pass of EarlyCSE.
llvm-svn: 306477
2017-06-27 22:25:02 +00:00
Aditya Nandakumar cca75d2406 [GISel]: Add G_FEXP, G_FEXP2 opcodes
Also add IRTranslator support.
https://reviews.llvm.org/D34710

llvm-svn: 306475
2017-06-27 22:19:32 +00:00
Rafael Espindola 650c96e0a7 clang-format a file.
It had a few inconsistent indentations that made a followup patch
hard to read.

llvm-svn: 306474
2017-06-27 22:14:20 +00:00
Dehao Chen 920d022519 re-commit r306336: Enable vectorizer-maximize-bandwidth by default.
Differential Revision: https://reviews.llvm.org/D33341

llvm-svn: 306473
2017-06-27 22:05:58 +00:00
Eugene Zelenko 4f820d0e01 [Analysis] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 306472
2017-06-27 21:52:05 +00:00
Sanjay Patel 70b36f193d [CGP] eliminate a sub instruction in memcmp expansion
As noted in D34071, there are some IR optimization opportunities that could be 
handled by normal IR passes if this expansion wasn't happening so late in CGP.

Regardless of that, it seems wasteful to knowingly produce suboptimal IR here, 
so I'm proposing this change:
  %s = sub i32 %x, %y
  %r = icmp ne %s, 0
    =>
  %r = icmp ne %x, %y

Changing the predicate to 'eq' mimics what InstCombine would do, so that's just
an efficiency improvement if we decide this expansion should happen sooner.

The fact that the PowerPC backend doesn't eliminate the 'subf.' might be 
something for PPC folks to investigate separately.

Differential Revision: https://reviews.llvm.org/D34416

llvm-svn: 306471
2017-06-27 21:46:34 +00:00
Tim Northover 849fcca090 GlobalISel: verify that a COPY is trivial when created.
Without this check, COPY instructions can actually be one of the generic casts
in disguise. That's confusing and bad.

At some point during ISel this restriction has to be relaxed since the fully
selected instructions will usually use COPY for those purposes. Right now I
think it's possible that relaxation occurs during RegBankSelect (hence the
change there). I'm not convinced that's where it belongs long-term though.

llvm-svn: 306470
2017-06-27 21:41:40 +00:00
Krzysztof Parzyszek 0b7688e6c0 Create a PHI value when merging with a known undef live-in
Differential Revision: https://reviews.llvm.org/D34640

llvm-svn: 306466
2017-06-27 21:30:46 +00:00
Joel Jones aea1c356e6 [AArch64] Performance enhancements for Cavium ThunderX2 T99
This patch enables significant performance enhancements to the
Cavium ThunderX2T99 LLVM backend, as observed by running SPEC2K6,
by adding more detailed scheduling information.

Related Bugzilla bug: http://bugs.llvm.org/show_bug.cgi?id=32562

Patch by: steleman

Differential Revision: https://reviews.llvm.org/D31801

llvm-svn: 306462
2017-06-27 20:44:55 +00:00
Sam Clegg 4df5d76477 [WebAssembly] Add support for printing relocations with llvm-objdump
Differential Revision: https://reviews.llvm.org/D34658

llvm-svn: 306461
2017-06-27 20:40:53 +00:00
Sam Clegg 9e1ade93a8 [WebAssembly] Add data size and alignement to linking section
The overal size of the data section (including BSS)
is otherwise not included in the wasm binary.

Differential Revision: https://reviews.llvm.org/D34657

llvm-svn: 306459
2017-06-27 20:27:59 +00:00
Krzysztof Parzyszek 25173e4cba [Hexagon] Use proper predicate register state when expanding PS_vselect
llvm-svn: 306458
2017-06-27 19:59:46 +00:00
Craig Topper 5fe0197622 [InstCombine] Propagate nsw flag when turning mul by pow2 into shift when the constant is a vector splat or the scalar bit width is larger than 64-bits
The check to see if we can propagate the nsw flag used m_ConstantInt(uint64_t*&) which doesn't work with splat vectors and has a restriction that the bitwidth of the ConstantInt must be 64-bits are less.

This patch changes it to use m_APInt to remove both these issues

Differential Revision: https://reviews.llvm.org/D34699

llvm-svn: 306457
2017-06-27 19:57:53 +00:00
Craig Topper ccbb810776 [Constants] Fix copy-pasto in llvm_unreachable message. NFC
llvm-svn: 306456
2017-06-27 19:57:51 +00:00
Sanjay Patel 352e60556e [CGP] simplify code to get bswap in memcmp expansion; NFCI
llvm-svn: 306452
2017-06-27 19:31:35 +00:00
Stanislav Mekhanoshin e8bf6c9629 [AMDGPU] Add 2 new alignbit patterns
Differential Revision: https://reviews.llvm.org/D34655

llvm-svn: 306449
2017-06-27 19:10:47 +00:00
Serge Guelton 7bc405aa4c [CodeExtractor] Prevent extraction of block involving blockaddress
BlockAddress are only valid within their function context, which does not
interact well with CodeExtractor. Detect this case and prevent it.

Differential Revision: https://reviews.llvm.org/D33839

llvm-svn: 306448
2017-06-27 18:57:53 +00:00
Stanislav Mekhanoshin c9bd53ab59 [AMDGPU] Simplify setcc (sext from i1 b), -1|0, cc
Depending on the compare code that can be either an argument of
sext or negate of it. This helps to avoid v_cndmask_b64 instruction
for sext. A reversed value can be further simplified and folded into
its parent comparison if possible.

Differential Revision: https://reviews.llvm.org/D34545

llvm-svn: 306446
2017-06-27 18:53:03 +00:00
Krzysztof Parzyszek 5ddd2e5899 [Hexagon] Update kills in hexagon-nvj even more properly than before
Account for the fact that both, the feeder and the compare can be moved
over instructions that kill registers.

llvm-svn: 306443
2017-06-27 18:37:16 +00:00
Matt Arsenault 836d786e86 RenameIndependentSubregs: Fix infinite loop
Apparently this replacement can really be substituting the
same as the original register. Avoid restarting the loop
when there's been no change in the register uses.

llvm-svn: 306441
2017-06-27 18:28:10 +00:00
Yaxun Liu 7c44f340de [SROA] Fix APInt size when alloca address space is not 0
SROA assumes alloca address space is 0, which causes assertion. This patch fixes that.

Differential Revision: https://reviews.llvm.org/D34104

llvm-svn: 306440
2017-06-27 18:26:06 +00:00
Stanislav Mekhanoshin 6851ddf942 [AMDGPU] Combine and x, (sext cc from i1) => select cc, x, 0
Also factored out function to check if a boolean is an already
deserialized value which does not require v_cndmask_b32 to be
loaded. Added binary logical operators to its check.

Differential Revision: https://reviews.llvm.org/D34500

llvm-svn: 306439
2017-06-27 18:25:26 +00:00
Sanjay Patel 9a4ce0cc1c [CGP] add an IR builder to memcmp expansion class instead of recreating it; NFCI
This was a clean-up suggestion from:
https://reviews.llvm.org/D34005

llvm-svn: 306438
2017-06-27 18:18:42 +00:00
Matthias Braun a6e77405d0 LiveRangeCalc: Slightly improve map usage; NFC
- DenseMap should be faster than std::map
- Use the `InsertRes = insert() if (!InsertRes.inserted)` pattern rather
  than the `if (!X.contains(...)) { X.insert(...); }` to save one map
  lookup.

llvm-svn: 306436
2017-06-27 18:05:26 +00:00