Commit Graph

153429 Commits

Author SHA1 Message Date
Krasimir Georgiev 3d55cef48b [AArch64] Silence unused variable warning in opt mode after r311533
llvm-svn: 311535
2017-08-23 08:40:22 +00:00
Sjoerd Meijer 24c98189ed [AArch64] ISel legalization debug messages. NFCI.
Debugging AArch64 instruction legalization and custom lowering is really an
unpleasant experience because it shows nodes that appear out of thin air.
In commit r311444, some debug messages have been added to SelectionDAG, the
target independent part, and this patch adds some AArch64 specific messages.

Differential Revision: https://reviews.llvm.org/D36964

llvm-svn: 311533
2017-08-23 08:18:37 +00:00
Alex Bradbury d5d559421f [Lanai] Remove dead functions from LanaiRegisterInfo
getEHExceptionRegister and getEHHandlerRegister are unused and were removed 
from most backends in rL192099. This patch removes them from Lanai.

Differential Revision: https://reviews.llvm.org/D36829

llvm-svn: 311531
2017-08-23 07:14:48 +00:00
Hiroshi Inoue dbb285ca51 Revert rL311526: [PowerPC] better instruction selection for OR (XOR) with a 32-bit immediate
This reverts commit rL311526 due to failures in some buildbot.

llvm-svn: 311530
2017-08-23 06:38:05 +00:00
Craig Topper a85f86225a [InstCombine] Remove unused argument. NFC
llvm-svn: 311529
2017-08-23 05:46:09 +00:00
Craig Topper a94069fb4c [InstCombine] Replace a simple matcher with a plain old dyn_cast. NFC
llvm-svn: 311528
2017-08-23 05:46:08 +00:00
Craig Topper 524c44f74e [InstCombine] Remove an unnecessary dyn_cast to Instruction and a switch over two opcodes. Just dyn_cast to the specific instruction classes individually. NFC
Change the helper methods to take the more specific class as well.

llvm-svn: 311527
2017-08-23 05:46:07 +00:00
Hiroshi Inoue c4449df1b0 [PowerPC] better instruction selection for OR (XOR) with a 32-bit immediate
On PPC64, OR (XOR) with a 32-bit immediate can be done with only two instructions, i.e. ori + oris.
But the current LLVM generates three or four instructions for this purpose (and also it clobbers one GPR).

This patch makes PPC backend generate ori + oris (xori + xoris) for OR (XOR) with a 32-bit immediate.

e.g. (x | 0xFFFFFFFF) should be

	ori 3, 3, 65535
	oris 3, 3, 65535

but LLVM generates without this patch

	li 4, 0
	oris 4, 4, 65535
	ori 4, 4, 65535
	or 3, 3, 4

Differential Revision: https://reviews.llvm.org/D34757

llvm-svn: 311526
2017-08-23 05:15:15 +00:00
Dean Michael Berris 0884b73220 [XRay][CodeGen] Use PIC-friendly code in XRay sleds; remove synthetic references in .text
Summary:
This change achieves two things:

  - Redefine the Custom Event handling instrumentation points emitted by
    the compiler to not require dynamic relocation of references to the
    __xray_CustomEvent trampoline.

  - Remove the synthetic reference we emit at the end of a function that
    we used to keep auxiliary sections alive in favour of SHF_LINK_ORDER
    associated with the section where the function is defined.

To achieve the custom event handling change, we've had to introduce the
concept of sled versioning -- this will need to be supported by the
runtime to allow us to understand how to turn on/off the new version of
the custom event handling sleds. That change has to land first before we
change the way we write the sleds.

To remove the synthetic reference, we rely on a relatively new linker
feature that preserves the sections that are associated with each other.
This allows us to limit the effects on the .text section of ELF
binaries.

Because we're still using absolute references that are resolved at
runtime for the instrumentation map (and function index) maps, we mark
these sections write-able. In the future we can re-define the entries in
the map to use relative relocations instead that can be statically
determined by the linker. That change will be a bit more invasive so we
defer this for later.

Depends on D36816.

Reviewers: dblaikie, echristo, pcc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36615

llvm-svn: 311525
2017-08-23 04:49:41 +00:00
Yonghong Song dc1dbf6ef3 bpf: add variants of -mcpu=# and support for additional jmp insns
-mcpu=# will support:
  . generic: the default insn set
  . v1: insn set version 1, the same as generic
  . v2: insn set version 2, version 1 + additional jmp insns
  . probe: the compiler will probe the underlying kernel to
           decide proper version of insn set.

We did not not use -mcpu=native since llc/llvm will interpret -mcpu=native
as the underlying hardware architecture regardless of -march value.

Currently, only x86_64 supports -mcpu=probe. Other architecture will
silently revert to "generic".

Also added -mcpu=help to print available cpu parameters.
llvm will print out the information only if there are at least one
cpu and at least one feature. Add an unused dummy feature to
enable the printout.

Examples for usage:
$ llc -march=bpf -mcpu=v1 -filetype=asm t.ll
$ llc -march=bpf -mcpu=v2 -filetype=asm t.ll
$ llc -march=bpf -mcpu=generic -filetype=asm t.ll
$ llc -march=bpf -mcpu=probe -filetype=asm t.ll
$ llc -march=bpf -mcpu=v3 -filetype=asm t.ll
'v3' is not a recognized processor for this target (ignoring processor)
...
$ llc -march=bpf -mcpu=help -filetype=asm t.ll
Available CPUs for this target:

  generic - Select the generic processor.
  probe   - Select the probe processor.
  v1      - Select the v1 processor.
  v2      - Select the v2 processor.

Available features for this target:

  dummy - unused feature.

Use +feature to enable a feature, or -feature to disable it.
For example, llc -mcpu=mycpu -mattr=+feature1,-feature2
...

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
llvm-svn: 311522
2017-08-23 04:25:57 +00:00
Matthias Braun d6c0868da5 Fix tail-merge-after-mbp test
The output of this test changed after the fix in r311520 to have
-run-pass=block-placement behave like it does in a normal pipeline.
Adjust the test.

llvm-svn: 311521
2017-08-23 03:49:53 +00:00
Matthias Braun 8426d1342d Add test case for r311511
This also changes the TailDuplicator to be configured explicitely
pre/post regalloc rather than relying on the isSSA() flag. This was
necessary to have `llc -run-pass` work reliably.

llvm-svn: 311520
2017-08-23 03:17:59 +00:00
Martell Malone cc82cdfffc NFC: fix ToolDrivers syntax and typo errors
infoTable -> InfoTable camelCase
Libtool Options #define offset

llvm-svn: 311517
2017-08-23 02:10:28 +00:00
George Karpenkov 0ac90d3f78 Update LLVM fuzzers to use the libFuzzer bundled with the compiler toolchain
Differential Revision: https://reviews.llvm.org/D37041

llvm-svn: 311515
2017-08-23 00:40:58 +00:00
George Karpenkov 218ea7f69c Remove llvm-pdbutil/fuzzer.
The code does not compile, is not maintained, and does not have a buildbot.

Differential Revision: https://reviews.llvm.org/D37032

llvm-svn: 311512
2017-08-23 00:02:10 +00:00
Matthias Braun 55bc9b3f9e TargetInstrInfo: Change duplicate() to work on bundles.
Adds infrastructure to clone whole instruction bundles rather than just
single instructions. This fixes a bug where tail duplication would
unbundle instructions while cloning.

This should unbreak the "Clang Stage 1: cmake, RA, with expensive checks
enabled" build on greendragon. The bot broke with r311139 hitting this
pre-existing bug.

A proper testcase will come next.

llvm-svn: 311511
2017-08-22 23:56:30 +00:00
Craig Topper 35189d5221 [SelectionDAG] Make ISD::isConstantSplatVector always return an element sized APInt.
This partially reverts r311429 in favor of making ISD::isConstantSplatVector do something not confusing. Turns out the only other user of it was also having to deal with the weird property of it returning a smaller size.

So rather than continue to deal with this quirk everywhere, just make the interface do something sane.

Differential Revision: https://reviews.llvm.org/D37039

llvm-svn: 311510
2017-08-22 23:54:13 +00:00
Craig Topper ec4b82571c [InstCombine] Remove check for sext of vector icmp from shouldOptimizeCast
Looks like for 'and' and 'or' we end up performing at least some of the transformations this is bocking in a round about way anyway.

For 'and sext(cmp1), sext(cmp2) we end up later turning it into 'select cmp1, sext(cmp2), 0'. Then we optimize that back to sext (and cmp1, cmp2). This is the same result we would have gotten if shouldOptimizeCast hadn't blocked it. We do something analogous for 'or'.

With this patch we allow that transformation to happen directly in foldCastedBitwiseLogic. And we now support the same thing for 'xor'. This is definitely opening up many other cases, but since we already went around it for some cases hopefully it's ok.

Differential Revision: https://reviews.llvm.org/D36213

llvm-svn: 311508
2017-08-22 23:40:15 +00:00
Jonas Devlieghere 4942a0b0f3 Revert "[llvm-dwarfdump] Print type names in DW_AT_type DIEs"
This reverts commit r311492.

llvm-svn: 311499
2017-08-22 21:59:46 +00:00
Jonas Devlieghere f456d1864d [llvm-dwarfdump] Print type names in DW_AT_type DIEs
This patch adds printing for DW_AT_type DIEs like it's currently already
the case for DW_AT_specification DIEs.

llvm-svn: 311492
2017-08-22 21:41:49 +00:00
Peter Collingbourne 001052a067 WholeProgramDevirt: Create bitcast to i8* at each virtual call site.
We can't reuse the llvm.assume instruction's bitcast because it may not
dominate every user of the vtable pointer.

Differential Revision: https://reviews.llvm.org/D36994

llvm-svn: 311491
2017-08-22 21:41:19 +00:00
Matt Morehouse b1fa8255db [SanitizerCoverage] Optimize stack-depth instrumentation.
Summary:
Use the initialexec TLS type and eliminate calls to the TLS
wrapper.  Fixes the sanitizer-x86_64-linux-fuzzer bot failure.

Reviewers: vitalybuka, kcc

Reviewed By: kcc

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D37026

llvm-svn: 311490
2017-08-22 21:28:29 +00:00
Jakub Kuderski 2724d45325 [ADCE][Dominators] Reapply: Teach ADCE to preserve dominators
Summary:
This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees.

This is reapplies the original patch r311057 that was reverted in r311381.
The previous version wasn't using the batch update api for updating dominators,
which in vary rare cases caused assertion failures.

This also fixes PR34258.

Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki

Reviewed By: davide

Subscribers: grandinj, zhendongsu, llvm-commits, david2050

Differential Revision: https://reviews.llvm.org/D35869

llvm-svn: 311467
2017-08-22 16:30:21 +00:00
Jonas Devlieghere a680a8f5f8 [Debug info] Add new DbgValues after looping over DAG
I was contacted by Jesper Antonsson from Ericsson who ran into problems
with r311181 in their test suites with for an out-of-tree target.
Because of the latter I don't have a reproducer, but we definitely don't
want to modify the data structure on which we are iterating inside the
loop.

llvm-svn: 311466
2017-08-22 16:28:07 +00:00
Sanjay Patel 0ab50f6d68 [x86] auto-generate full checks; NFC
I don't see anything Darwin-specific here, so I made the target generic x86-64.

llvm-svn: 311465
2017-08-22 16:27:00 +00:00
Sanjay Patel 40b8e3bfe5 [x86] simplify runs and auto-generate full checks
I've replaced the two OS-specific runs with a generic run because
there's no functional difference in the resulting output that
we're checking. Also, the script still doesn't work with a Win
target.

llvm-svn: 311463
2017-08-22 16:21:45 +00:00
Erich Keane 0343ef8672 Emit section information for extern variables
Update IR generated to retain section information for external declarations. 
This is related to https://reviews.llvm.org/D36487

Patch By: eandrews
Differential Revision: https://reviews.llvm.org/D36712

llvm-svn: 311459
2017-08-22 15:30:43 +00:00
Sam Parker d65e19f7b3 [ARM][AArch64] Add Armv8.3-a unittests
Add Armv8.3-A to the architecture to the TargetParser unittests.

Differential Revision: https://reviews.llvm.org/D36748

llvm-svn: 311450
2017-08-22 12:46:33 +00:00
Sam Parker 6dc3fcb1c6 [ARM][AArch64] v8.3-A Javascript Conversion
Armv8.3-A adds instructions that convert a double-precision floating
point number to a signed 32-bit integer with round towards zero,
designed for improving Javascript performance.

Differential Revision: https://reviews.llvm.org/D36785

llvm-svn: 311448
2017-08-22 11:08:21 +00:00
Renato Golin c070c73d5e [ARM] Avoid creating duplicate ANDs in SelectionDAG
When expanding a BRCOND into a BR_CC, do not create an AND 1
if one already exists.

Review: D36705

Patch by Joel Galenson <jgalenson@google.com>

llvm-svn: 311447
2017-08-22 11:02:45 +00:00
Renato Golin f63d701669 [ARM] Call setBooleanContents(ZeroOrOneBooleanContent)
The ARM backend should call setBooleanContents so that it can
use known bits to make some optimizations.

Review: D35821

Patch by Joel Galenson <jgalenson@google.com>

llvm-svn: 311446
2017-08-22 11:02:37 +00:00
Sjoerd Meijer e0c933f5d6 [SelectionDAG] Add getNode debug messages
This adds debug messages to various functions that create new SDValue nodes.
This is e.g. useful to have during legalization, as otherwise it can prints
legalization info of nodes that did not appear in the dumps before.

Differential Revision: https://reviews.llvm.org/D36984

llvm-svn: 311444
2017-08-22 10:43:51 +00:00
Sjoerd Meijer b9de2b4871 [AArch64] Cleanup of HasFullFP16 argument. NFC.
This is a clean up of commit r311154; it's not necessary to pass HasFullFP16 as
an argument, instead just query the DAG.

Differential Revision: https://reviews.llvm.org/D36978

llvm-svn: 311438
2017-08-22 09:21:08 +00:00
Chandler Carruth b866178067 Fix a typo in r311435.
llvm-svn: 311437
2017-08-22 09:20:52 +00:00
Alex Bradbury 080f6976c0 Use report_fatal_error for unsupported calling conventions
The calling convention can be specified by the user in IR. Failing to support 
a particular calling convention isn't a programming error, and so relying on 
llvm_unreachable to catch and report an unsupported calling convention is not 
appropriate.

Differential Revision: https://reviews.llvm.org/D36830

llvm-svn: 311435
2017-08-22 09:11:41 +00:00
George Rimar 1e94ca115d [lib/Analysis] - Mark personality functions as live.
This is PR33245.

Case I am fixing is next:
Imagine we have 2 BC files, one defines and uses personality routine,
second has only declaration and also uses it.

Previously algorithm computing dead symbols (llvm::computeDeadSymbols) did
not know about personality routines and leaved them dead even if function that
has routine was live.

As a result thinLTOInternalizeAndPromoteGUID() method changed binding for
such symbol to local. Later when LLD tried to link these objects it failed
because one object had undefined global symbol for routine and second
object contained local definition instead of global.

Patch set the live root flag on the corresponding FunctionSummary
for personality routines when we build the per-module summaries
during the compile step.

Differential revision: https://reviews.llvm.org/D36834

llvm-svn: 311432
2017-08-22 08:50:56 +00:00
Craig Topper b49f0893b2 [X86] Prevent several calls to ISD::isConstantSplatVector from returning a narrower APInt than the original scalar type
ISD::isConstantSplatVector can shrink to the smallest splat width. But we don't check the size of the resulting APInt at all. This can cause us to misinterpret the results.

This patch just adds a flag to prevent the APInt from changing width.

Fixes PR34271.

Differential Revision: https://reviews.llvm.org/D36996

llvm-svn: 311429
2017-08-22 05:40:17 +00:00
Eric Beckmann 87c6acf38a Integrate manifest merging library into LLD.
Summary: Now that the llvm-mt manifest merging libraries are complete, we may use them to merge manifests instead of needing to shell out to mt.exe.

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D36255

llvm-svn: 311424
2017-08-22 03:15:28 +00:00
Adrian Prantl acdc3a7bff dsymutil: don't copy compile units without children from PCM files
rdar://problem/33830532

llvm-svn: 311416
2017-08-22 01:10:48 +00:00
George Karpenkov 748bf121bb Moving libFuzzer from LLVM to compiler-rt.
This change only removes libFuzzer tests and CMake machinery,
the source copy temporarily remains at the old location.

Differential Revision: https://reviews.llvm.org/D36980

llvm-svn: 311405
2017-08-21 23:25:12 +00:00
Justin Bogner 7d449d31a4 Re-apply "Introduce FuzzMutate library"
Same as r311392 with some fixes for library dependencies. Thanks to
Chapuni for helping work those out!

Original commit message:

This introduces the FuzzMutate library, which provides structured
fuzzing for LLVM IR, as described in my EuroLLVM 2017 talk. Most of
the basic mutators to inject and delete IR are provided, with support
for most basic operations.

llvm-svn: 311402
2017-08-21 22:57:06 +00:00
Quentin Colombet 4056e80719 [RegAlloc] Make sure live-ranges reflect the state of the IR when removing them
When removing a live-range we used to not touch them making debug
prints harder to read because the IR was not matching what the
live-ranges information was saying.

This only affects debug printing and allows to put stronger asserts in
the code (see r308906 for instance).

llvm-svn: 311401
2017-08-21 22:56:18 +00:00
Craig Topper 7227ebad9c [ValueTracking] Add assertions that the starting Depth in isKnownToBeAPowerOfTwo and ComputeNumSignBitsImpl is not above MaxDepth
The function does an equality check later to terminate the recursion, but that won't work if its starts out too high. Similar assert already exists in computeKnownBits.

llvm-svn: 311400
2017-08-21 22:56:12 +00:00
Sanjay Patel 6f527aae0b [InstCombine] add udiv/urem tests with constant numerator; NFC
llvm-svn: 311396
2017-08-21 22:40:02 +00:00
Justin Bogner 6e39755d84 Revert "Re-apply "Introduce FuzzMutate library""
The dependencies for the new library seem to be misconfigured on some
linux configs:

  http://bb.pgr.jp/builders/llvm-i686-linux-RA/builds/5435/steps/build_all/logs/stdio

This reverts r311392.

llvm-svn: 311393
2017-08-21 22:28:47 +00:00
Justin Bogner f5c8736482 Re-apply "Introduce FuzzMutate library"
Redo r311356 with a fix to avoid std::uniform_int_distribution<bool>.
The bool specialization is undefined according to the standard, even
though libc++ seems to have it.

Original commit message:

This introduces the FuzzMutate library, which provides structured
fuzzing for LLVM IR, as described in my [EuroLLVM 2017 talk][1]. Most
of the basic mutators to inject and delete IR are provided, with
support for most basic operations.

llvm-svn: 311392
2017-08-21 22:25:04 +00:00
Sanjay Patel 5e3037cfc4 [InstCombine] add more tests for udiv/urem narrowing; NFC
We don't currently limit these folds with hasOneUse() or shouldChangeType().

llvm-svn: 311390
2017-08-21 21:57:52 +00:00
Evandro Menezes bc11ca1a31 [AArch64] Restore the test of conditional branch fusion
Restore the functionality of this test that was broken by
https://reviews.llvm.org/rL306144.

Differential revision: https://reviews.llvm.org/D36807

llvm-svn: 311389
2017-08-21 21:57:43 +00:00
Tim Northover ef1fc5ae89 GlobalISel (AArch64): fix ABI at border between GPRs and SP.
If a struct would end up half in GPRs and half on SP the ABI says it should
actually go entirely on the stack. We were getting this wrong in GlobalISel
before, causing compatibility issues.

llvm-svn: 311388
2017-08-21 21:56:11 +00:00
Steven Wu 010fc49e42 [IR] AutoUpgrade ModuleFlagBehavior for PIC and PIE level
Summary:
From r303590, ModuleFlagBehavior for PIC and PIE level is changed from
Error to Max. This will cause bitcode compatibility issue when linking
against a bitcode static archive built with old compiler.
Add an auto-ugprade path to upgrade the the ModuleFlagBehavior in the
old bitcode to match the new one so IRLinker can link them.

Reviewers: tejohnson, mehdi_amini, dexonsmith

Reviewed By: dexonsmith

Subscribers: hans, llvm-commits

Differential Revision: https://reviews.llvm.org/D36556

llvm-svn: 311387
2017-08-21 21:49:13 +00:00
Craig Topper 775ffcc8f5 [InstCombine] Move the checks for pointer types in getMaskedTypeForICmpPair earlier in the function
I don't think there's any reason to have them scattered about and on all 4 operands. We already have an early check that both compares must be the same type. And within a given compare the LHS and RHS must have the same type. Beyond that I don't think there's anyway this function returns anything valid for pointer types. So let's just return early and be done with it.

Differential Revision: https://reviews.llvm.org/D36561

llvm-svn: 311383
2017-08-21 21:00:45 +00:00
Pirama Arumuga Nainar 3d48bb5fc2 [Support, Windows] Handle long paths with unix separators
Summary:
The function widenPath() for Windows also normalizes long path names by
iterating over the path's components and calling append().  The
assumption during the iteration that separators are not returned by the
iterator doesn't hold because the iterators do return a separator when
the path has a drive name.  Handle this case by ignoring separators
during iteration.

Reviewers: rnk

Subscribers: danalbert, srhines

Differential Revision: https://reviews.llvm.org/D36752

llvm-svn: 311382
2017-08-21 20:49:44 +00:00
Sanjoy Das 08a38fe71e Revert "Reapply: [ADCE][Dominators] Teach ADCE to preserve dominators"
Summary: This partially reverts commit r311057 since it breaks ADCE.  See PR34258.

Reviewers: kuhar

Subscribers: mcrosier, david2050, llvm-commits

Differential Revision: https://reviews.llvm.org/D36979

llvm-svn: 311381
2017-08-21 20:39:18 +00:00
Sam Elliott 6f9a9b5769 [ORE] Remove Old Optimization Remark API
Summary: https://bugs.llvm.org/show_bug.cgi?id=33789

Reviewers: anemet

Reviewed By: anemet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36972

llvm-svn: 311380
2017-08-21 20:30:44 +00:00
Zachary Turner 5641c07d6b [PDB] Serialize records into a stack-allocated buffer.
We were using a std::vector<> and resizing to MaxRecordLength,
which is ~64KB.  We would then do this repeatedly often many
times in a tight loop, which was causing measurable performance
impact when linking PDBs.

Patch by Alex Telishev
Differential Revision: https://reviews.llvm.org/D36940

llvm-svn: 311375
2017-08-21 20:17:19 +00:00
George Karpenkov fb0994b37e Always compile libFuzzer with no coverage
Do not compile libFuzzer itself with coverage, regardless of LLVM variables

Differential Revision: https://reviews.llvm.org/D36887

llvm-svn: 311374
2017-08-21 20:12:58 +00:00
Zachary Turner d76dc2d31e [lld/pdb] Speed up construction of publics & globals addr map.
computeAddrMap function calls std::stable_sort with a comparison
function that computes deserialized symbols every time its called.
In the result deserializeAs<PublicSym32> is called 20-30 times per
symbol. It's much faster to calculate it beforehand and pass a
pointer to it to the comparison function.

Patch by Alex Telishev
Differential Revision: https://reviews.llvm.org/D36941

llvm-svn: 311373
2017-08-21 20:08:40 +00:00
Haicheng Wu 0812c5bea3 [InlineCost] Add cl::opt to allow full inline cost to be computed for debugging purposes.
Currently, the inline cost model will bail once the inline cost exceeds the
inline threshold in order to avoid unnecessary compile-time. However, when
debugging it is useful to compute the full cost, so this command line option
is added to override the default behavior.

I took over this work from Chad Rosier (mcrosier@codeaurora.org).

Differential Revision: https://reviews.llvm.org/D35850

llvm-svn: 311371
2017-08-21 20:00:09 +00:00
Chad Rosier 4eb18742ca [InlineCost] Add more debug during inline cost computation.
llvm-svn: 311370
2017-08-21 19:56:46 +00:00
Zachary Turner abc037927b [BinaryStream] Defaultify copy and move constructors.
The various BinaryStream classes had explicit copy constructors
which resulted in deleted move constructors.  This was causing
the internal std::shared_ptr to get copied rather than moved
very frequently, since these classes are often used as return
values.

Patch by Alex Telishev
Differential Revision: https://reviews.llvm.org/D36942

llvm-svn: 311368
2017-08-21 19:46:46 +00:00
Sanjay Patel 82ec872990 [LibCallSimplifier] try harder to fold memcmp with constant arguments (2nd try)
The 1st try was reverted because it could inf-loop by creating a dead instruction.
Fixed that to not happen and added a test case to verify.

Original commit message:

Try to fold:
memcmp(X, C, ConstantLength) == 0 --> load X == *C

Without this change, we're unnecessarily checking the alignment of the constant data,
so we miss the transform in the first 2 tests in the patch.

I noted this shortcoming of LibCallSimpifier in one of the recent CGP memcmp expansion
patches. This doesn't help the example in:
https://bugs.llvm.org/show_bug.cgi?id=34032#c13
...directly, but it's worth short-circuiting more of these simple cases since we're
already trying to do that.

The benefit of transforming to load+cmp is that existing IR analysis/transforms may
further simplify that code. For example, if the load of the variable is common to
multiple memcmp calls, CSE can remove the duplicate instructions.

Differential Revision: https://reviews.llvm.org/D36922

llvm-svn: 311366
2017-08-21 19:13:14 +00:00
Craig Topper 74177e1ed1 [InstCombine] Teach foldSelectICmpAnd to recognize a (icmp slt X, 0) and (icmp sgt X, -1) as equivalent to an and with the sign bit of the truncated type
This is similar to what was already done in foldSelectICmpAndOr. Ultimately I'd like to see if we can call foldSelectICmpAnd from foldSelectIntoOp if we detect a power of 2 constant. This would allow us to remove foldSelectICmpAndOr entirely.

Differential Revision: https://reviews.llvm.org/D36498

llvm-svn: 311362
2017-08-21 19:02:06 +00:00
Justin Bogner b5fb3b56d7 Revert "Introduce FuzzMutate library"
Looks like this fails to build with libstdc++.

This reverts r311356

llvm-svn: 311358
2017-08-21 17:57:12 +00:00
Justin Bogner 0233637085 Introduce FuzzMutate library
This introduces the FuzzMutate library, which provides structured
fuzzing for LLVM IR, as described in my [EuroLLVM 2017 talk][1]. Most
of the basic mutators to inject and delete IR are provided, with
support for most basic operations.

I will follow up with the instruction selection fuzzer, which is
implemented in terms of this library.

[1]: http://llvm.org/devmtg/2017-03//2017/02/20/accepted-sessions.html#2

llvm-svn: 311356
2017-08-21 17:44:36 +00:00
Sean Fertile 00393cce3a [PPC] Refine checks for emiting TOC restore nop and tail-call eligibility.
For the medium and large code models we only need to check if a call crosses
dso-boundaries when considering tail-call elgibility.

Differential Revision: https://reviews.llvm.org/D34245

llvm-svn: 311353
2017-08-21 17:35:32 +00:00
Sam Elliott e963c89d11 Migrate WholeProgramDevirt to new Optimization Remark API
Summary:
This is an attempt to move WholeProgramDevirt to the new remark API.

https://bugs.llvm.org/show_bug.cgi?id=33793

Reviewers: anemet

Reviewed By: anemet

Subscribers: fhahn, llvm-commits

Differential Revision: https://reviews.llvm.org/D36943

llvm-svn: 311352
2017-08-21 16:57:21 +00:00
Davide Italiano 5a2530da05 [APFloat] Fix IsInteger() for DoubleAPFloat.
Previously, we would just assert instead.

Differential Revision:  https://reviews.llvm.org/D36961

llvm-svn: 311351
2017-08-21 16:51:54 +00:00
Sanjay Patel cf081f9a30 [InstCombine] add tests for memcmp with constant; NFC
This is the baseline (current) version of the tests that would
have been added with the transform in r311333 (reverted at 
r311340 due to inf-looping).

Adding these now to aid in testing and minimize the patch if/when
it is reinstated.

llvm-svn: 311350
2017-08-21 16:47:12 +00:00
Sam Elliott e604b563ea Emit only A Single Opt Remark When Inlining
Summary:
This updates the Inliner to only add a single Optimization
Remark when Inlining, rather than an Analysis Remark and an
Optimization Remark.

Fixes https://bugs.llvm.org/show_bug.cgi?id=33786

Reviewers: anemet, davidxl, chandlerc

Reviewed By: anemet

Subscribers: haicheng, fhahn, mehdi_amini, dblaikie, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D36054

llvm-svn: 311349
2017-08-21 16:45:47 +00:00
Craig Topper cc255bcd77 [InstCombine] Fix a weakness in canEvaluateZExtd around 'and' instructions
Summary:
If the bitsToClear from the LHS of an 'and' comes back non-zero, but all of those bits are known zero on the RHS, we can reset bitsToClear.

Without this, the 'or' in the modified test case blocks the transform because it has non-zero bits in its RHS in those bits.

Reviewers: spatel, majnemer, davide

Reviewed By: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36944

llvm-svn: 311343
2017-08-21 16:04:11 +00:00
Craig Topper 8078dd2984 [X86] When selecting sse_load_f32/f64 pattern, make sure there's only one use of every node all the way back to the root of the match
Summary: With masked operations, its possible for the operation node like fadd, fsub, etc. to be used by multiple different vselects. Since the pattern matching will start at the vselect, we need to make sure the operation node itself is only used once before we can fold a load. Otherwise we'll end up folding the same load into multiple instructions.

Reviewers: RKSimon, spatel, zvi, igorb

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36938

llvm-svn: 311342
2017-08-21 16:04:04 +00:00
Xinliang David Li d2838fc4b9 Revert 311208, 311209
llvm-svn: 311341
2017-08-21 16:00:38 +00:00
Sanjay Patel 707f786cc5 revert r311333: [LibCallSimplifier] try harder to fold memcmp with constant arguments
We're getting lots of compile-timeout bot failures like:
http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/7119
http://lab.llvm.org:8011/builders/clang-cmake-x86_64-avx2-linux

llvm-svn: 311340
2017-08-21 15:16:25 +00:00
Sanjay Patel 0707434ce8 [InstCombine] add vector tests; NFC
llvm-svn: 311339
2017-08-21 15:11:39 +00:00
Zachary Turner d1de2f4f5e [llvm-pdbutil] Add support for dumping detailed module stats.
This adds support for dumping a summary of module symbols
and CodeView debug chunks.  This option prints a table for
each module of all of the symbols that occurred in the module
and the number of times it occurred and total byte size.  Then
at the end it prints the totals for the entire file.

Additionally, this patch adds the -jmc (just my code) option,
which suppresses modules which are from external libraries or
linker imports, so that you can focus only on the object files
and libraries that originate from your own source code.

llvm-svn: 311338
2017-08-21 14:53:25 +00:00
Sanjay Patel 48c67c9965 [InstCombine] regenerate test checks; NFC
llvm-svn: 311337
2017-08-21 14:34:06 +00:00
Sanjay Patel 7756edfa93 [LibCallSimplifier] try harder to fold memcmp with constant arguments
Try to fold:
memcmp(X, C, ConstantLength) == 0 --> load X == *C

Without this change, we're unnecessarily checking the alignment of the constant data, 
so we miss the transform in the first 2 tests in the patch.

I noted this shortcoming of LibCallSimpifier in one of the recent CGP memcmp expansion 
patches. This doesn't help the example in:
https://bugs.llvm.org/show_bug.cgi?id=34032#c13
...directly, but it's worth short-circuiting more of these simple cases since we're 
already trying to do that.

The benefit of transforming to load+cmp is that existing IR analysis/transforms may
further simplify that code. For example, if the load of the variable is common to 
multiple memcmp calls, CSE can remove the duplicate instructions.

Differential Revision: https://reviews.llvm.org/D36922

llvm-svn: 311333
2017-08-21 13:55:49 +00:00
Stefan Pintilie 9495f33e45 [PowerPC] Check if the pre-increment PHI Node already exists
Preparations to use the per-increment are sometimes done in the target
independent pass Loop Strength Reduction. We try to detect them in the PowerPC
specific pass so that they are not done twice and so that we do not add PHIs
that are not required.

Differential Revision: https://reviews.llvm.org/D36736

llvm-svn: 311332
2017-08-21 13:36:18 +00:00
Igor Breger 685889cf9b [GlobalISel][X86] Support G_BRCOND operation.
Summary: Support G_BRCOND operation. For now don't try to fold cmp/trunc instructions.

Reviewers: zvi, guyblank

Reviewed By: guyblank

Subscribers: rovka, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D34754

llvm-svn: 311327
2017-08-21 10:51:54 +00:00
Oliver Stannard 9bd18aa7d8 [AsmParser] Recommit: Hash is not a comment on some targets
Re-committing after r311325 fixed an unintentional use of '#' comments in
clang.

The '#' token is not a comment for all targets (on ARM and AArch64 it marks an
immediate operand), so we shouldn't treat it as such.

Comments are already converted to AsmToken::EndOfStatement by
AsmLexer::LexLineComment, so this check was unnecessary.

Differential Revision: https://reviews.llvm.org/D36405

llvm-svn: 311326
2017-08-21 09:58:37 +00:00
Igor Breger 03c2208d5f [GlobalISel][X86] InstructionSelector, for now use fallback path for LOAD_STACK_GUARD and PHI nodes.
llvm-svn: 311323
2017-08-21 09:17:28 +00:00
Igor Breger 1b5e3d3e28 [GlobalISel][X86] LowerCall, for now don't handel ByValue function arguments.
llvm-svn: 311321
2017-08-21 08:59:59 +00:00
Michael Zuckerman bdb6673151 [InterLeaved] Adding lit test for future work interleaved load strid 3
llvm-svn: 311320
2017-08-21 08:56:39 +00:00
Chandler Carruth 98c51cbee1 [x86] Teach the "generic" x86 CPU to avoid patterns that are slow on
widely used processors.

This occured to me when I saw that we were generating 'inc' and 'dec'
when for Haswell and newer we shouldn't. However, there were a few "X is
slow" things that we should probably just set.

I've avoided any of the "X is fast" features because most of those would
be pretty serious regressions on processors where X isn't actually fast.
The slow things are likely to be negligible costs on processors where
these aren't slow and a significant win when they are slow.

In retrospect this seems somewhat obvious. Not sure why we didn't do
this a long time ago.

Differential Revision: https://reviews.llvm.org/D36947

llvm-svn: 311318
2017-08-21 08:45:22 +00:00
Chandler Carruth 63dd5e0ef6 [x86] Handle more cases where we can re-use an atomic operation's flags
rather than doing a separate comparison.

This both saves an explicit comparision and avoids the use of `xadd`
which introduces register constraints and other challenges to the
generated code.

The motivating case is from atomic reference counts where `1` is the
sentinel rather than `0` for whatever reason. This can and should be
lowered efficiently on x86 by just using a different flag, however the
x86 code only handled the `0` case.

There remains some further opportunities here that are currently hidden
due to canonicalization. I've included test cases that show these and
FIXMEs. However, I don't at the moment have any production use cases and
they seem substantially harder to address.

Differential Revision: https://reviews.llvm.org/D36945

llvm-svn: 311317
2017-08-21 08:45:19 +00:00
Sam Parker b252ffd2cc [ARM][AArch64] Cortex-A75 and Cortex-A55 support
This patch introduces support for Cortex-A75 and Cortex-A55, Arm's
latest big.LITTLE A-class cores. They implement the ARMv8.2-A
architecture, including the cryptography and RAS extensions, plus
the optional dot product extension. They also implement the RCpc
AArch64 extension from ARMv8.3-A.

Cortex-A75:
https://developer.arm.com/products/processors/cortex-a/cortex-a75

Cortex-A55:
https://developer.arm.com/products/processors/cortex-a/cortex-a55

Differential Revision: https://reviews.llvm.org/D36667

llvm-svn: 311316
2017-08-21 08:43:06 +00:00
George Rimar d7305ef06c [Support/Parallel] - Do not use a task group for a very small task.
parallel_for_each_n splits a given task into small pieces of tasks and then
passes them to background threads managed by a thread pool to process them
in parallel. TaskGroup then waits for all tasks to be done, which is done by
TaskGroup's destructor.

In the previous code, all tasks were passed to background threads, and the
main thread just waited for them to finish their jobs. This patch changes
the logic so that the main thread processes a task just like other
worker threads instead of just waiting for workers.

This patch improves the performance of parallel_for_each_n for a task which
is too small that we do not split it into multiple tasks. Previously, such task
was submitted to another thread and the main thread waited for its completion.
That involves multiple inter-thread synchronization which is not cheap for
small tasks. Now, such task is processed by the main thread, so no inter-thread
communication is necessary.

Differential revision: https://reviews.llvm.org/D36607

llvm-svn: 311312
2017-08-21 08:00:54 +00:00
Coby Tayree c54c5cbe67 [X86] Allow xacquire/xrelease prefixes
Allow those prefixes on assembly code
Differential Revision: https://reviews.llvm.org/D36845

llvm-svn: 311309
2017-08-21 07:50:15 +00:00
Craig Topper d6f4be97e6 [AVX-512] Don't change which instructions we use for unmasked subvector broadcasts when AVX512DQ is enabled.
There's no functional difference between the AVX512DQ instructions if we're not masking.

This change unifies test checks and removes extra isel entries. Similar was done for subvector insert and extracts recently.

llvm-svn: 311308
2017-08-21 05:29:02 +00:00
Craig Topper 485cca1ecb [AVX512] Add 128->256 vbroadcastf64x2/vbroadcasti64x2 instructions to the EVEX->VEX table.
llvm-svn: 311307
2017-08-21 05:03:28 +00:00
Dean Michael Berris c5caf3e9c6 [XRay][tools] Support new kinds of instrumentation map entries
Summary:
When extracting the instrumentation map from a binary, we should be able
to recognize the new kinds of instrumentation sleds we've been emitting
with the compiler using -fxray-instrument. This change adds a test for
all the kinds of sleds we currently support (sans the tail-call sled,
which is a bit harder to force in a simple prebuilt input).

Reviewers: kpw, dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36819

llvm-svn: 311305
2017-08-21 00:14:06 +00:00
Chandler Carruth bd6dc14230 Revert r311077: [LV] Using VPlan ...
This causes LLVM to assert fail on PPC64 and crash / infloop in other
cases. Filed http://llvm.org/PR34248 with reproducer attached.

llvm-svn: 311304
2017-08-20 23:17:11 +00:00
Craig Topper a152903c1b [InstCombine] Add a test case for a weakness in canEvaluateZExtd. NFC
llvm-svn: 311303
2017-08-20 21:38:28 +00:00
Craig Topper d63b33f9c4 [AVX512] Add a test to check what happens when a load is referenced by two different masked scalar intrinsics with the same op inputs, but different masking node.
We're missing some single use checks in the sse_load_f32/f64 handling that cause us to replicate the load.

llvm-svn: 311300
2017-08-20 19:47:00 +00:00
Kuba Mracek 6734671dda Fix archive-update.test after r311296.
llvm-svn: 311299
2017-08-20 18:31:30 +00:00
Craig Topper 702097dafc [AVX-512] Use a scalar load pattern for FPCLASSSS/FPCLASSSD patterns.
llvm-svn: 311297
2017-08-20 18:30:24 +00:00
Kuba Mracek 2c0bca49b1 Remove uses of "%T" from test/Object/archive-* tests.
llvm-svn: 311296
2017-08-20 18:18:44 +00:00
Benjamin Kramer 806ae44012 [NVPTX] Reduce copypasta.
No functionality change intended.

llvm-svn: 311295
2017-08-20 17:30:32 +00:00
Kuba Mracek d3f3fae32d Get rid of even more "%T" expansions, see <https://reviews.llvm.org/D35396>.
llvm-svn: 311294
2017-08-20 17:05:22 +00:00
Kuba Mracek 5c393d2565 Get rid of some more "%T" expansions, see <https://reviews.llvm.org/D35396>.
llvm-svn: 311293
2017-08-20 17:00:08 +00:00
Benjamin Kramer 760e00b0bc [MachO] Use Twines more efficiently.
llvm-svn: 311291
2017-08-20 15:13:39 +00:00
Benjamin Kramer 2e5be849cc [Mem2Reg] Modernize code a bit.
No functionality change intended.

llvm-svn: 311290
2017-08-20 14:34:44 +00:00
Benjamin Kramer 49a49fe816 Move helper classes into anonymous namespaces.
No functionality change intended.

llvm-svn: 311288
2017-08-20 13:03:48 +00:00
Benjamin Kramer df8c2628ac [dlltool] Make memory buffer ownership less weird.
There's no reason to destroy them in a global destructor.

llvm-svn: 311287
2017-08-20 13:03:32 +00:00
Elena Demikhovsky f58f838495 Changed basic cost of store operation on X86
Store operation takes 2 UOps on X86 processors. The exact cost calculation affects several optimization passes including loop unroling.
This change compensates performance degradation caused by https://reviews.llvm.org/D34458 and shows improvements on some benchmarks.

Differential Revision: https://reviews.llvm.org/D35888

llvm-svn: 311285
2017-08-20 12:34:29 +00:00
Aditya Kumar a525fffd07 [Loop Vectorize] Added a separate metadata
Added a separate metadata to indicate when the loop
has already been vectorized instead of setting width and count to 1.

Patch written by Divya Shanmughan and Aditya Kumar

Differential Revision: https://reviews.llvm.org/D36220

llvm-svn: 311281
2017-08-20 10:32:41 +00:00
Igor Breger 88a3d5c855 [GlobalISel][X86] Support call ABI.
Summary: Support call ABI. For now only Linux C and X86_64_SysV calling conventions supported. Variadic function not supported.

Reviewers: zvi, guyblank, oren_ben_simhon

Reviewed By: oren_ben_simhon

Subscribers: rovka, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D34602

llvm-svn: 311279
2017-08-20 09:25:22 +00:00
Igor Breger b3a860a5e8 [GlobalISel][X86] Support asimetric copy from/to GPR physical register.
Usually this case generated by ABI lowering, it requare to performe trancate/anyext.

llvm-svn: 311278
2017-08-20 07:14:40 +00:00
Alex Bradbury 21c8adf50b [RISCV] Trivial whitespace fix in RISCVInstPrinter
llvm-svn: 311277
2017-08-20 06:58:43 +00:00
Alex Bradbury e45186d43f [RISCV] Fix two abuses of llvm_unreachable
Replace with report_fatal_error.

llvm-svn: 311276
2017-08-20 06:57:27 +00:00
Alex Bradbury dd83484ab4 [RISCV] Set HasRelocationAddend for RISCVELFObjectWriter
llvm-svn: 311275
2017-08-20 06:55:14 +00:00
Sam Elliott 7fe0aaa140 Revert "Emit only A Single Opt Remark When Inlining"
Reverting due to clang build failure

llvm-svn: 311274
2017-08-20 06:55:10 +00:00
Sam Elliott 785dd75369 Emit only A Single Opt Remark When Inlining
Summary:
This updates the Inliner to only add a single Optimization
Remark when Inlining, rather than an Analysis Remark and an
Optimization Remark.

Fixes https://bugs.llvm.org/show_bug.cgi?id=33786

Reviewers: anemet, davidxl, chandlerc

Reviewed By: anemet

Subscribers: haicheng, fhahn, mehdi_amini, dblaikie, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D36054

llvm-svn: 311273
2017-08-20 06:43:34 +00:00
Igor Breger d88dfd32f8 [GlobalIsel] Fix undefined behavior if Action not set (release), it aslo crashing in debug mode.
Differential Revision: https://reviews.llvm.org/D34978

llvm-svn: 311272
2017-08-20 06:26:22 +00:00
Sam Elliott b0c9753691 Keep Optimization Remark Yaml in NewPM
Summary:
The New Pass Manager infrastructure was forgetting to keep around the optimization remark yaml file that the compiler might have been producing. This meant setting the option to '-' for stdout worked, but setting it to a filename didn't give file output (presumably it was deleted because compilation didn't explicitly keep it). This change just ensures that the file is kept if compilation succeeds.

So far I have updated one of the optimization remark output tests to add a version with the new pass manager. It is my intention for this patch to also include changes to all tests that use `-opt-remark-output=` but I wanted to get the code patch ready for review while I was making all those changes.

Fixes https://bugs.llvm.org/show_bug.cgi?id=33951

Reviewers: anemet, chandlerc

Reviewed By: anemet, chandlerc

Subscribers: javed.absar, chandlerc, fhahn, llvm-commits

Differential Revision: https://reviews.llvm.org/D36906

llvm-svn: 311271
2017-08-20 01:30:45 +00:00
Chandler Carruth 9ef881efab [x86] Fix an even stranger corner case where we have multiple levels of
cmov self-refrencing.

Pointed out by Amjad Aboud in code review, test case minorly simplified
from the one he posted.

llvm-svn: 311267
2017-08-19 23:35:50 +00:00
Craig Topper 4de6f583da [X86] Merge all of the vecload and alignedload predicates into single predicates.
We can load the memory VT and check for natural alignment. This also adds a new preferNonTemporalLoad helper that checks the correct subtarget feature based on the load size.

This shrinks the isel table by at least 5000 bytes by allowing more reordering and combining to occur.

llvm-svn: 311266
2017-08-19 23:21:22 +00:00
Craig Topper afa69eecbb [X86] Converge alignedstore/alignedstore256/alignedstore512 to a single predicate.
We can read the memoryVT and get its store size directly from the SDNode to check its alignment.

llvm-svn: 311265
2017-08-19 23:21:21 +00:00
Craig Topper a0319bb434 [AVX512] Use alignedstore256 in a pattern that's emitting a 256-bit movaps from an extract subvector operation.
llvm-svn: 311263
2017-08-19 22:02:02 +00:00
Victor Leschuk ee3292c5e4 Set init value for ScalarEvolution::BackedgeTakenInfo::MaxOrZero
Otherwise it can be used uninitialized in move ctor.

llvm-svn: 311262
2017-08-19 21:05:08 +00:00
Martin Storsjo d606d2a643 [ARM] Factorize the calculation of WhichResult in isV*Mask. NFC.
Differential Revision: https://reviews.llvm.org/D36930

llvm-svn: 311260
2017-08-19 20:26:51 +00:00
Martin Storsjo 91522ffa12 [ARM] Check the right order for halves of VZIP/VUZP if both parts are used
This is the exact same fix as in SVN r247254. In that commit, the fix was
applied only for isVTRNMask and isVTRN_v_undef_Mask, but the same issue
is present for VZIP/VUZP as well.

This fixes PR33921.

Differential Revision: https://reviews.llvm.org/D36899

llvm-svn: 311258
2017-08-19 19:47:48 +00:00
Teresa Johnson b225ad05af Fix bot failures by requiring x86 target
The tests added in r311254 require a target triple since they are
running through code generation. Fix bot failures by requiring
an x86 target.

llvm-svn: 311257
2017-08-19 19:15:04 +00:00
Konstantin Zhuravlyov 89377c440c AMDGPU/NFC: Reorder functions in SIMemoryLegalizer:
- Move *load* functions before *atomic* functions
  - Move *store* functions before *atomic* functions

llvm-svn: 311256
2017-08-19 18:44:27 +00:00
Jatin Bhateja 6b4c205685 [DAGCombiner] Extending pattern detection for vector shuffle.
Summary:
    If all the operands of a BUILD_VECTOR extract elements from same vector then split the
    vector efficiently based on the maximum vector access index.

    Reviewers: zvi, delena, RKSimon, thakis

    Reviewed By: RKSimon

    Subscribers: chandlerc, eladcohen, llvm-commits

    Differential Revision: https://reviews.llvm.org/D35788

llvm-svn: 311255
2017-08-19 18:08:59 +00:00
Teresa Johnson 73305f82e9 [ThinLTO] Fix ThinLTO crash
Summary:
Follow up to fix in r311023, which fixed the case where the combined
index is written to disk. The same samplePGO logic exists for the
in-memory index when computing imports, so we need to filter out
GlobalVariable summaries there too.

Reviewers: davidxl

Subscribers: inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D36919

llvm-svn: 311254
2017-08-19 18:04:25 +00:00
Craig Topper 6e70f7cd33 [X86] Remove an unnecessary alignment restriction from MOVDDUP pattern.
The SSE MOVDDUP instruction only loads 64-bits with no alignment restriction.

llvm-svn: 311253
2017-08-19 18:02:28 +00:00
Jatin Bhateja 66f7958e91 Revert rL311247 : To rectify commit message.
Summary: This reverts commit rL311247.

Differential Revision: https://reviews.llvm.org/D36927

llvm-svn: 311252
2017-08-19 17:59:58 +00:00
Jatin Bhateja 6f0d0d23b0 Merge branch 'arcpatch-D35788'
llvm-svn: 311247
2017-08-19 17:00:04 +00:00
Jatin Bhateja 1c56863739 Revert rL311242 "Extension of shuffle vector pattern detection, updating post rebase."
Summary:

This reverts commit rL311242.

Differential Revision: https://reviews.llvm.org/D36924

llvm-svn: 311246
2017-08-19 16:40:06 +00:00
Jatin Bhateja 313f97dd84 Extension of shuffle vector pattern detection, updating post rebase.
llvm-svn: 311242
2017-08-19 15:58:36 +00:00
Victor Leschuk ee7d232a41 revert failing test
llvm-svn: 311238
2017-08-19 12:24:41 +00:00
Victor Leschuk ba0954c4e2 Add temporary test to verify that win10 builder hangs on error
llvm-svn: 311236
2017-08-19 12:02:39 +00:00
Victor Leschuk 59dc64f3af Temporary mark lit :: shtest-format as unsupported on windows
When run manually it fails, but when run under buildbot it causes hang.

llvm-svn: 311230
2017-08-19 07:58:07 +00:00
Chandler Carruth 4f3aa29a46 [Inliner] Fix a nasty bug when inlining a non-recursive trace of
a function into itself.

We tried to fix this before in r306495 but that got reverted as the
assert was actually hit.

This fixes the original bug (which we seem to have lost track of with
the revert) by blocking a second remapping when the function being
inlined is also the caller and the remapping could succeed but
erroneously.

The included test case would actually load from an inlined copy of the
alloca before this change, failing to load the stored value and
miscompiling.

Many thanks to Richard Smith for diagnosing a user miscompile to this
bug, and to Kyle for the first attempt and initial analysis and David Li
for remembering the issue and how to fix it and suggesting the patch.
I'm just stitching it together and landing it. =]

llvm-svn: 311229
2017-08-19 06:56:11 +00:00
Chandler Carruth 2a80fddf67 [Inliner] Clean up a test case a bit to make it more clear what is being
tested and why.

llvm-svn: 311228
2017-08-19 06:06:44 +00:00
Chandler Carruth 1f8212597d [SLP] Fix an unused variable warning in non-asserts builds.
llvm-svn: 311227
2017-08-19 05:06:23 +00:00
Chandler Carruth 93a645525c [x86] Teach the cmov converter to aggressively convert cmovs with memory
operands into control flow.

We have seen periodically performance problems with cmov where one
operand comes from memory. On modern x86 processors with strong branch
predictors and speculative execution, this tends to be much better done
with a branch than cmov. We routinely see cmov stalling while the load
is completed rather than continuing, and if there are subsequent
branches, they cannot be speculated in turn.

Also, in many (even simple) cases, macro fusion causes the control flow
version to be fewer uops.

Consider the IACA output for the initial sequence of code in a very hot
function in one of our internal benchmarks that motivates this, and notice the
micro-op reduction provided.
Before, SNB:
```
Throughput Analysis Report
--------------------------
Block Throughput: 2.20 Cycles       Throughput Bottleneck: Port1

| Num Of |              Ports pressure in cycles               |    |
|  Uops  |  0  - DV  |  1  |  2  -  D  |  3  -  D  |  4  |  5  |    |
---------------------------------------------------------------------
|   1    |           | 1.0 |           |           |     |     | CP | mov rcx, rdi
|   0*   |           |     |           |           |     |     |    | xor edi, edi
|   2^   | 0.1       | 0.6 | 0.5   0.5 | 0.5   0.5 |     | 0.4 | CP | cmp byte ptr [rsi+0xf], 0xf
|   1    |           |     | 0.5   0.5 | 0.5   0.5 |     |     |    | mov rax, qword ptr [rsi]
|   3    | 1.8       | 0.6 |           |           |     | 0.6 | CP | cmovbe rax, rdi
|   2^   |           |     | 0.5   0.5 | 0.5   0.5 |     | 1.0 |    | cmp byte ptr [rcx+0xf], 0x10
|   0F   |           |     |           |           |     |     |    | jb 0xf
Total Num Of Uops: 9
```
After, SNB:
```
Throughput Analysis Report
--------------------------
Block Throughput: 2.00 Cycles       Throughput Bottleneck: Port5

| Num Of |              Ports pressure in cycles               |    |
|  Uops  |  0  - DV  |  1  |  2  -  D  |  3  -  D  |  4  |  5  |    |
---------------------------------------------------------------------
|   1    | 0.5       | 0.5 |           |           |     |     |    | mov rax, rdi
|   0*   |           |     |           |           |     |     |    | xor edi, edi
|   2^   | 0.5       | 0.5 | 1.0   1.0 |           |     |     |    | cmp byte ptr [rsi+0xf], 0xf
|   1    | 0.5       | 0.5 |           |           |     |     |    | mov ecx, 0x0
|   1    |           |     |           |           |     | 1.0 | CP | jnbe 0x39
|   2^   |           |     |           | 1.0   1.0 |     | 1.0 | CP | cmp byte ptr [rax+0xf], 0x10
|   0F   |           |     |           |           |     |     |    | jnb 0x3c
Total Num Of Uops: 7
```
The difference even manifests in a throughput cycle rate difference on Haswell.
Before, HSW:
```
Throughput Analysis Report
--------------------------
Block Throughput: 2.00 Cycles       Throughput Bottleneck: FrontEnd

| Num Of |                    Ports pressure in cycles                     |    |
|  Uops  |  0  - DV  |  1  |  2  -  D  |  3  -  D  |  4  |  5  |  6  |  7  |    |
---------------------------------------------------------------------------------
|   0*   |           |     |           |           |     |     |     |     |    | mov rcx, rdi
|   0*   |           |     |           |           |     |     |     |     |    | xor edi, edi
|   2^   |           |     | 0.5   0.5 | 0.5   0.5 |     | 1.0 |     |     |    | cmp byte ptr [rsi+0xf], 0xf
|   1    |           |     | 0.5   0.5 | 0.5   0.5 |     |     |     |     |    | mov rax, qword ptr [rsi]
|   3    | 1.0       | 1.0 |           |           |     |     | 1.0 |     |    | cmovbe rax, rdi
|   2^   | 0.5       |     | 0.5   0.5 | 0.5   0.5 |     |     | 0.5 |     |    | cmp byte ptr [rcx+0xf], 0x10
|   0F   |           |     |           |           |     |     |     |     |    | jb 0xf
Total Num Of Uops: 8
```
After, HSW:
```
Throughput Analysis Report
--------------------------
Block Throughput: 1.50 Cycles       Throughput Bottleneck: FrontEnd

| Num Of |                    Ports pressure in cycles                     |    |
|  Uops  |  0  - DV  |  1  |  2  -  D  |  3  -  D  |  4  |  5  |  6  |  7  |    |
---------------------------------------------------------------------------------
|   0*   |           |     |           |           |     |     |     |     |    | mov rax, rdi
|   0*   |           |     |           |           |     |     |     |     |    | xor edi, edi
|   2^   |           |     | 1.0   1.0 |           |     | 1.0 |     |     |    | cmp byte ptr [rsi+0xf], 0xf
|   1    |           | 1.0 |           |           |     |     |     |     |    | mov ecx, 0x0
|   1    |           |     |           |           |     |     | 1.0 |     |    | jnbe 0x39
|   2^   | 1.0       |     |           | 1.0   1.0 |     |     |     |     |    | cmp byte ptr [rax+0xf], 0x10
|   0F   |           |     |           |           |     |     |     |     |    | jnb 0x3c
Total Num Of Uops: 6
```

Note that this cannot be usefully restricted to inner loops. Much of the
hot code we see hitting this is not in an inner loop or not in a loop at
all. The optimization still remains effective and indeed critical for
some of our code.

I have run a suite of internal benchmarks with this change. I saw a few
very significant improvements and a very few minor regressions,
but overall this change rarely has a significant effect. However, the
improvements were very significant, and in quite important routines
responsible for a great deal of our C++ CPU cycles. The gains pretty
clealy outweigh the regressions for us.

I also ran the test-suite and SPEC2006. Only 11 binaries changed at all
and none of them showed any regressions.

Amjad Aboud at Intel also ran this over their benchmarks and saw no
regressions.

Differential Revision: https://reviews.llvm.org/D36858

llvm-svn: 311226
2017-08-19 05:01:19 +00:00
Chandler Carruth e3b3547e9f [x86] Refactor the CMOV conversion pass to be more flexible.
The primary thing that this accomplishes is to allow future re-use of
these routines in more contexts and clarify the behavior w.r.t. loops.
For example, if handling outer loops is desirable, doing so in
a inside-out order becomes straight forward because it walks the loop
nest itself (rather than walking the function's basic blocks) and
de-couples the CMOV rewriting from the loop structure as there isn't
actually anything loop-specific about this transformation.

This patch should be essentially a no-op. It potentially changes the
order in which we visit the inner loops, but otherwise should merely set
the stage for subsequent changes.

Differential Revision: https://reviews.llvm.org/D36783

llvm-svn: 311225
2017-08-19 04:28:20 +00:00
Dinar Temirbulatov 7aff8cfa55 [SLPVectorizer] Tighten up VLeft, VRight declaration, remove unnecessary testcase test/Transforms/SLPVectorizer/X86/reorder.ll, NFCI.
llvm-svn: 311223
2017-08-19 03:15:07 +00:00
Dinar Temirbulatov e3ce1b455e [SLPVectorizer] Add opcode parameter to reorderAltShuffleOperands, reorderInputsAccordingToOpcode functions.
Reviewers: mkuper, RKSimon, ABataev, mzolotukhin, spatel, filcab

Subscribers: llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D36766

llvm-svn: 311221
2017-08-19 02:54:20 +00:00
Matthias Braun 91bd3ad128 ARMRegsiterInfo: Define more ssub indexes; NFC
This doesn't really change anything as Tablegen would have inferred
those indices anyway; defining them gives us shorter names that are
easier to read while debugging (i.e. "ssub_4" rather than
"dsub2_then_ssub_0")

llvm-svn: 311218
2017-08-19 01:21:11 +00:00
Adrian Prantl 2116dd360a Filter out non-constant DIGlobalVariableExpressions reachable via the CU
They won't affect the DWARF output, but they will mess with the
sorting of the fragments. This fixes the crash reported in PR34159.

https://bugs.llvm.org/show_bug.cgi?id=34159

llvm-svn: 311217
2017-08-19 01:15:06 +00:00
Eric Beckmann 91d8af5386 llvm-mt: Merge manifest namespaces.
mt.exe performs a tree merge where certain element nodes are combined
into one.  This introduces the possibility of xml namespaces conflicting
with each other.  The original mt.exe has a hierarchy whereby certain
namespace names can override others, and nodes that would then end up in
ambigious namespaces have their namespaces explicitly defined.  This
namespace handles this merging process.

llvm-svn: 311215
2017-08-19 00:37:41 +00:00
Eugene Zelenko be709f2c19 [Analysis] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 311212
2017-08-18 23:51:26 +00:00
Xinliang David Li 0d07f9d68a Fix comment /NFC
llvm-svn: 311209
2017-08-18 23:08:50 +00:00
Xinliang David Li 709ffe178e [Profile] backward propagate profile info in JumpThreading
Differential Revsion: http://reviews.llvm.org/D36864

llvm-svn: 311208
2017-08-18 23:00:05 +00:00
Amjad Aboud 88ffa3afe2 [InstCombine] Teach ComputeNumSignBitsImpl to handle integer multiply instruction.
Differential Revision: https://reviews.llvm.org/D36679

llvm-svn: 311206
2017-08-18 22:56:55 +00:00
Max Kazantsev 0aaf8c16ac [IRCE] Fix buggy behavior in Clamp
Clamp function was too optimistic when choosing signed or unsigned min/max function for calculations.
In fact, `!IsSignedPredicate` guarantees us that `Smallest` and `Greatest` can be compared safely using unsigned
predicates, but we did not check this for `S` which can in theory be negative.

This patch makes Clamp use signed min/max for cases when it fails to prove `S` being non-negative,
and it adds a test where such situation may lead to incorrect conditions calculation.

Differential Revision: https://reviews.llvm.org/D36873

llvm-svn: 311205
2017-08-18 22:50:29 +00:00
Justin Bogner b29bebe47b IR: Make stripDebugInfo robust against (invalid) empty basic blocks
Since stripDebugInfo runs before the verifier when reading IR, we can
end up in a situation where we read some invalid IR but don't know its
invalid yet. Before this patch we would crash in stripDebugInfo when
given IR with a completely empty basic block, and after we get a nice
error from the verifier instead.

llvm-svn: 311202
2017-08-18 21:38:03 +00:00
Jonas Devlieghere a2faf7b60f [llvm-dwarfdump] Hide .debug_str and DIE reference offsets in brief mode
This patch hides the .debug_str offset and DIE reference offsets into
the CU when llvm-dwarfdump is invoked with -brief.

Differential Revision: https://reviews.llvm.org/D36835

llvm-svn: 311201
2017-08-18 21:35:44 +00:00
Simon Pilgrim f36cca88fb [X86][ADX] Regenerate ADX intrinsics tests
llvm-svn: 311198
2017-08-18 21:21:14 +00:00
Sanjay Patel 7046cbd691 fix typos in comments; NFC
llvm-svn: 311193
2017-08-18 20:27:47 +00:00
Ana Pazos 6210f27dfc [PGO] Fixed assertion due to mismatched memcpy size type.
Summary:
Memcpy intrinsics have size argument of any integer type, like i32 or i64.
Fixed size type along with its value when cloning the intrinsic.

Reviewers: davidxl, xur

Reviewed By: davidxl

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D36844

llvm-svn: 311188
2017-08-18 19:17:08 +00:00
Tim Northover 14302fcb24 ARM: use an external relocation for calls from MachO ARM mode.
The internal (__text-relative) relocation risks the offset not being encodable
if the destination is Thumb.

llvm-svn: 311187
2017-08-18 19:13:56 +00:00
Matt Morehouse 5c7fc76983 [SanitizerCoverage] Add stack depth tracing instrumentation.
Summary:
Augment SanitizerCoverage to insert maximum stack depth tracing for
use by libFuzzer.  The new instrumentation is enabled by the flag
-fsanitize-coverage=stack-depth and is compatible with the existing
trace-pc-guard coverage.  The user must also declare the following
global variable in their code:
  thread_local uintptr_t __sancov_lowest_stack

https://bugs.llvm.org/show_bug.cgi?id=33857

Reviewers: vitalybuka, kcc

Reviewed By: vitalybuka

Subscribers: kubamracek, hiraditya, cfe-commits, llvm-commits

Differential Revision: https://reviews.llvm.org/D36839

llvm-svn: 311186
2017-08-18 18:43:30 +00:00
Marek Sokolowski 5cd3d5c8d6 Reapply: [llvm-rc] Add basic RC scripts parsing ability.
As for now, the parser supports a limited set of statements and
resources. This will be extended in the following patches.

Thanks to Nico Weber (thakis) for his original work in this area.

This patch was originally submitted as r311175 and got reverted
in r311177 because of the problems with compilation under gcc.

Differential Revision: https://reviews.llvm.org/D36340

llvm-svn: 311184
2017-08-18 18:24:17 +00:00
Jonas Devlieghere e101b07a1d [Debug info] Transfer DI to fragment expressions for split integer values.
This patch teaches the SDag type legalizer how to split up debug info for
integer values that are split into a hi and lo part.

(re-commit)

Differential Revision: https://reviews.llvm.org/D36805

llvm-svn: 311181
2017-08-18 18:07:00 +00:00
Ben Dunbobbin 7b76de2dcb [lit] support unsetting env variables (again!)
This is an updated version of https://reviews.llvm.org/D22144 by @jlpeyton.

The patch was accepted but not landed.

This is useful functionality and I would like to use this to enable lit tests for environment variable behaviour.

Differential Revision: https://reviews.llvm.org/D36403

llvm-svn: 311180
2017-08-18 17:32:57 +00:00
Konstantin Zhuravlyov f5d826a294 AMDGPU/NFC: Rename few things in SIMemoryLegalizer:
- AtomicInfo -> MemOpInfo
  - getAtomicLoadInfo -> getLoadInfo
  - getAtomicStoreInfo -> getStoreInfo
  - expandAtomicLoad -> expandLoad
  - expandAtomicStore -> expandStore

Differential Revision: https://reviews.llvm.org/D36861

llvm-svn: 311179
2017-08-18 17:30:02 +00:00
Marek Sokolowski f276f52014 Revert "[llvm-rc] Add basic RC scripts parsing ability."
This reverts commit r311175.

This failed some buildbots compilation.

llvm-svn: 311177
2017-08-18 17:25:55 +00:00
Jakub Kuderski 756c09a58f [Dominators] Don't print the whole tree when running with -debug
As the incremental API is now used in several transforms, printing
the whole dominator tree creates a lot of noise when running with
the `-debug` flag. This patch fixes that.

llvm-svn: 311176
2017-08-18 17:06:37 +00:00
Marek Sokolowski dbc16476c1 [llvm-rc] Add basic RC scripts parsing ability.
As for now, the parser supports a limited set of statements and
resources. This will be extended in the following patches.

Thanks to Nico Weber (thakis) for his original work in this area.

Differential Revision: https://reviews.llvm.org/D36340

llvm-svn: 311175
2017-08-18 17:05:47 +00:00
Ben Dunbobbin ac6a5aab45 [Support] env vars with empty values on windows
An environment variable can be in one of three states:

1. undefined.
2. defined with a non-empty value.
3. defined but with an empty value.

The windows implementation did not support case 3
(it was not handling errors). The Linux implementation
is already correct.

Differential Revision: https://reviews.llvm.org/D36394

llvm-svn: 311174
2017-08-18 16:55:44 +00:00
Simon Pilgrim 879ce046ad [X86][BMI2] Added scheduling test for RORX/SARX/SHLX/SHRX instructions
llvm-svn: 311171
2017-08-18 16:26:39 +00:00
Brian Gesiak 8953a7c544 [Lexicon] Add "GEP"
Summary:
`getelementptr` is frequently abbreviated as "GEP", often in source files that
do not ever reference the full name of the instruction. Add it to the Lexicon,
in case readers go to look for what it means there.

Test plan:
1. `ninja sphinx`
2. Confirm that the rendered docs HTML contains the new "GEP" entry

llvm-svn: 311168
2017-08-18 15:35:53 +00:00
Simon Pilgrim 358aeae7b8 [X86][AES] Add scheduling latency/throughput tests for AES instructions
llvm-svn: 311167
2017-08-18 15:26:51 +00:00
Simon Pilgrim 9eb0869e91 [X86][PCLMUL] Add scheduling latency/throughput test for PCLMULQDQ instruction
Added it to the SSE42 tests as targets seem to always have both

llvm-svn: 311166
2017-08-18 15:08:30 +00:00
Simon Pilgrim ccaec26175 [X86][SHA] Add scheduling latency/throughput tests for SHA instructions
llvm-svn: 311164
2017-08-18 14:55:50 +00:00
Simon Pilgrim 7f506f7d72 [X86][MOVBE] Add scheduling latency/throughput tests for MOVBE instructions
llvm-svn: 311163
2017-08-18 14:44:31 +00:00
Sam Parker 04a7db5915 [ARM] Add PostRAScheduler option
This patch adds the option to allow also using the PostRA scheduler,
which brings the ARM backend inline with AArch64 targets. The
SchedModel can also set 'PostRAScheduler', as the R52 does, so also
query this property in the overridden function.

Differential Revision: https://reviews.llvm.org/D36866

llvm-svn: 311162
2017-08-18 14:27:51 +00:00
Simon Dardis 02c9a3dfc3 [mips] Follow up comments on r310460
Use dblaikie's suggestion of cast<> instead of a seperate assert.

llvm-svn: 311160
2017-08-18 13:27:02 +00:00
Simon Pilgrim 320f89782a [X86][BMI2] Added scheduling test for MULX instructions
llvm-svn: 311159
2017-08-18 13:22:18 +00:00
Sjoerd Meijer ec9581e5e0 [AArch64] Do not promote f16 when subtarget HasFullFP16
Armv8.2-A adds FP16 support, i.e. f16 is not only a storage-only type, but it
also supports performing data processing on 16-bit floating-point quantities.
All the necessary (tablegen) groundwork of adding the ARMv8.2-A FP16 (scalar)
instructions was done in D15014. To take advantage of this, this patch avoids
promotion of f16 to f32 types when the subtarget supports FullFP16, which
enables instruction selection of these FP16 instructions.

Differential Revision: https://reviews.llvm.org/D36396

llvm-svn: 311154
2017-08-18 10:51:14 +00:00
Renato Golin 6fd16d37ae [Triple] Define OS Check for Haiku
This adds the OS check for the Haiku operating system, as it was
missing in the Triple class. Tests for x86_64-unknown-haiku and
i586-pc-haiku were also added.

These patches only affect Haiku and are completely harmless for
other platforms.

Patch by Calvin Hill <calvin@hakobaito.co.uk>

llvm-svn: 311153
2017-08-18 10:35:42 +00:00
Ilya Biryukov 827c8acc21 Addressed some security issues in Dockerfiles.
Summary:
- Removed --trust-server-cert from `svn checkout` invocations.
  Installing 'ca-certificates' package on ubuntu adds required CAs to
  the system and svn can do proper checkout using https.

- Added checksum verification when installing cmake from cmake.org.

Reviewers: mehdi_amini, klimek

Reviewed By: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36673

llvm-svn: 311152
2017-08-18 09:37:23 +00:00
Diana Picus 42ea77d5c2 Revert "GlobalISel (AArch64): fix ABI at border between GPRs and SP."
This reverts commit e8fd20964798ca6d46d2729dd3a789707a6416da in an
attempt to appease the GlobalISel buildbot, which fails in the
test-suite with errors like
fpcmp: files differ without tolerance allowance

llvm-svn: 311151
2017-08-18 09:31:21 +00:00
Sam Parker 25efe769c0 [AArch64] Fix for buildbots, unused function
Removing function declaration, my previous commit broke the bots.

llvm-svn: 311150
2017-08-18 09:08:05 +00:00
Victor Leschuk 091da14423 Remove useless default case in switch
llvm-svn: 311149
2017-08-18 09:02:06 +00:00
Sam Parker 96f8959cfd [AArch64] Remove DecodeAuthLoadWriteback
The BaseAuthLoad instruction class was incorrectly passing an empty
constraint string to its parent, so I have corrected this. This makes
the DecodeAuthLoadWriteback function redundant, so I've also removed
it.

Differential Revision: https://reviews.llvm.org/D36741

llvm-svn: 311148
2017-08-18 08:39:54 +00:00
Alex Bradbury f698a29a51 Refine report_fatal_error guidance after post-commit review
Use text suggested by Justin Bogner in post-commit review of r311146 
<http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170814/479898.html>, 
which makes it clear that report_fatal_error shouldn't be used when there is a 
practicable alternative. Also make this clearer in CodingStandards.

llvm-svn: 311147
2017-08-18 06:45:34 +00:00
Alex Bradbury 7182440fbd Give guidance on report_fatal_error in CodingStandards.rst and ProgrammersManual.rst
The current ProgrammersManual.rst document has a lot of well-written 
documentation on error handling thanks to @lhames. It suggests errors can be 
split cleanly into "programmatic" and "recoverable" errors. However, the 
reality in current LLVM seems to be there are a number of cases where a 
non-programmatic error is not easily recoverable. Therefore, add a note to 
indicate the existence of report_fatal_error for these cases. I've also added 
a reminder to CodingStandards.rst in the section on assertions, to indicate 
that llvm_unreachable and assertions should not be relied upon to report 
errors triggered by user input.

The ProgrammersManual is also silent on the use of LLVMContext::diagnose, 
which is used in BPF+WebAssembly+AMDGPU to report some errors during 
instruction selection. I don't address that in this patch, as it's not quite 
clear how to fit in to the current error handling story

Differential Revision: https://reviews.llvm.org/D36826

llvm-svn: 311146
2017-08-18 05:29:21 +00:00
Craig Topper e3edd9c9be [DAGCombiner] Fix bad comment that had immediate values swapped from the code and what they need to be to make sense. NFC
llvm-svn: 311144
2017-08-18 04:52:46 +00:00
Jatin Bhateja e739fc7d11 Test commit access
Summary: Adding a blank line.

Differential Revision: https://reviews.llvm.org/D36859

llvm-svn: 311143
2017-08-18 02:39:28 +00:00
Geoff Berry bd47e8a4f7 Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding" round 2
This reverts commit r311135.

sanitizer-x86_64-linux-android buildbot is timing out with just this
patch applied.

llvm-svn: 311142
2017-08-18 01:43:11 +00:00
Richard Smith c0541dfa3e Increase tail dup threshold for -O3 from 3 to 4.
We see a modest performance improvement from this slightly higher tail dup threshold.

Differential Revision: https://reviews.llvm.org/D36775

llvm-svn: 311139
2017-08-17 23:38:41 +00:00
Craig Topper 1fae3ae6f0 [X86] Remove SSE/AVX patterns for AND/XOR/OR/ANDN that checked for the inputs being bitcasted from floating point types.
There's really no reason to do this we should just let isel pick the integer version and let the execution dependency fixing pass take care of moving to FP if necessary.

It's not very reliable to look for bitcasts at the edges of patterns. If for some reason one input was bitcasted and the other wasn't, or if one was a v4f32 bitcast and one was a v2f64 bitcast, we would have fallen back to the integer pattern anyway.

llvm-svn: 311138
2017-08-17 23:20:57 +00:00
Tim Northover 48fff995d6 GlobalISel (AArch64): fix ABI at border between GPRs and SP.
If a struct would end up half in GPRs and half on SP the ABI says it should
actually go entirely on the stack. We were getting this wrong in GlobalISel
before, causing compatibility issues.

llvm-svn: 311137
2017-08-17 23:14:01 +00:00
Geoff Berry 51f52c4fca Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
Two issues identified by buildbots were addressed:
    - The pass no longer forwards COPYs to physical register uses, since
      doing so can break code that implicitly relies on the physical
      register number of the use.
    - The pass no longer forwards COPYs to undef uses, since doing so
      can break the machine verifier by creating LiveRanges that don't
      end on a use (since the undef operand is not considered a use).

    [MachineCopyPropagation] Extend pass to do COPY source forwarding

    This change extends MachineCopyPropagation to do COPY source forwarding.

    This change also extends the MachineCopyPropagation pass to be able to
    be run during register allocation, after physical registers have been
    assigned, but before the virtual registers have been re-written, which
    allows it to remove virtual register COPY LiveIntervals that become dead
    through the forwarding of all of their uses.

    Reviewers: qcolombet, javed.absar, MatzeB, jonpa

    Subscribers: jyknight, nemanjai, llvm-commits, nhaehnle, mcrosier, mgorny

    Differential Revision: https://reviews.llvm.org/D30751

llvm-svn: 311135
2017-08-17 23:06:55 +00:00
Zachary Turner 4c432b202f Fix warning about covered switch default.
llvm-svn: 311129
2017-08-17 22:20:15 +00:00
Tom Stellard a096b12628 AMDGPU: Add R600InstPrinter class
Summary:
This is step towards separating the GCN and R600 tablegen'd code.

This is a little awkward for now, because the R600 functions won't have the
MCSubtargetInfo parameter, so we need to have AMDMGPUInstPrinter
delegate to R600InstPrinter, but once the tablegen'd code is split,
we will be able to drop the delegation and use R600InstPrinter directly.

Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D36444

llvm-svn: 311128
2017-08-17 22:20:04 +00:00
Jakub Kuderski e608ef7635 [LoopRotate][Dominators] Use the incremental API to update DomTree
Summary: This patch teaches LoopRotate to use the new incremental API to update the DominatorTree.

Reviewers: dberlin, davide, grosser, sanjoy

Reviewed By: dberlin, davide

Subscribers: hiraditya, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D35581

llvm-svn: 311125
2017-08-17 21:48:19 +00:00
Eugene Zelenko 6e07bfd0d9 [CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 311124
2017-08-17 21:26:39 +00:00
Zachary Turner 197bba0028 Remove unused variable.
llvm-svn: 311119
2017-08-17 20:18:36 +00:00
Zachary Turner 96bcd6a37a [llvm-pdbutil] Fix some dumping issues.
When dumping, we were treating the S_INLINESITESYM as referring
to a type record, when it actually refers to an id record.  We
had this correct in TypeIndexDiscovery, so our merging algorithm
should be fine, but we had it wrong in the dumper, which means it
would appear to work most of the time, unless the index was out
of bounds in the type stream, when it would fail.  Fixed this, and
audited a few other cases to make them match the behavior in
TypeIndexDiscovery.

Also, I've now observed a new symbol record with kind 0x1168 which
I have no clue what it is, so to avoid crashing we have to just
print "Unknown Symbol Kind".

llvm-svn: 311117
2017-08-17 20:04:51 +00:00
Zachary Turner f401e1102d Fix a few minor issues when dumping symbols.
1) We weren't handling symbol types that weren't able to parse,
   even if we knew what the leaf type was.  This was triggering
   when trying to dump /DEBUG:FASTLINK PDBs, where we expect a
   certain symbol to show up, but we just don't know how to parse
   it.
2) We lost the code for dumping record bytes, so this was added
   back.

llvm-svn: 311116
2017-08-17 20:04:31 +00:00
Lang Hames df1e59b640 [docs] Tweak phrasing of the varargs explanation in the command section of the
CMake primer.

This moves the introduction of the ARGV/ARGN variables up to immmediately follow
the introduction of the concept of variable argument functions, and explicitly
connects this concept to C varargs functions.

llvm-svn: 311113
2017-08-17 18:21:53 +00:00
Lang Hames 3a6a2d26c6 [docs] Fix typo and tweak wording of special variable handling in CMake primer.
llvm-svn: 311112
2017-08-17 18:00:28 +00:00
Jonas Devlieghere 30756da212 Revert "[Debug info] Transfer DI to fragment expressions for split integer values."
This reverts commit r311102.

llvm-svn: 311111
2017-08-17 17:58:33 +00:00
Alexey Bataev 84ad9ae032 [SimplifyCFG] Add a test for preserve store alignment, NFC.
llvm-svn: 311106
2017-08-17 17:26:52 +00:00