Commit Graph

167474 Commits

Author SHA1 Message Date
Simon Pilgrim e447a273bd [X86] Use isNullConstant helper. NFCI.
llvm-svn: 338516
2018-08-01 11:24:11 +00:00
Simon Pilgrim 18d025a732 [llvm-mca][x86] Add STC + STD instruction resource tests
llvm-svn: 338514
2018-08-01 11:00:11 +00:00
Andrea Di Biagio 5291df3c98 [llvm-mca] Improve code comments. NFC.
llvm-svn: 338513
2018-08-01 10:49:01 +00:00
Jonas Devlieghere f73205b08b [DebugInfo] Remove ambiguity to fix Windows bots
Should fix the MSVC bots by explicitly invoking
llvm::make_reverse_iterator to remove ambiguity with
std::make_reverse_iterator.

llvm-svn: 338511
2018-08-01 10:40:08 +00:00
Jonas Devlieghere 2c045cf9d0 [DebugInfo] Improve consistency in DWARFDie.h (NFC)
Follow-up for r338506 with some unrelated changes in formatting and
consistency.

llvm-svn: 338509
2018-08-01 10:30:34 +00:00
Andrew V. Tischenko dad919d357 [X86] Improved sched models for X86 BT*rr instructions.
Differential Revision: https://reviews.llvm.org/D49243

llvm-svn: 338507
2018-08-01 10:24:27 +00:00
Jonas Devlieghere 3a04bea91a [DebugInfo] Have custom std::reverse_iterator<DWARFDie>
The DWARFDie is a lightweight utility wrapper that stores a pointer to a
compile unit and a debug info entry. Currently, its iterator (used for
walking over its children) stores a DWARFDie and returns a const
reference when dereferencing it.

When the iterator is modified (by incrementing or decrementing it), this
reference becomes invalid. This was happening when calling reverse on
it, because the std::reverse_iterator is keeping a temporary copy of the
iterator (see
https://en.cppreference.com/w/cpp/iterator/reverse_iterator for a good
illustration).

The relevant code in libcxx:

  reference operator*() const {_Iter __tmp = current; return *--__tmp;}

When dereferencing the reverse iterator, we decrement and return a
reference to a DWARFDie stored in the stack frame of this function,
resulting in UB at runtime.

This patch specifies the std::reverse_iterator for DWARFDie to do the
right thing.

Differential revision: https://reviews.llvm.org/D49679

llvm-svn: 338506
2018-08-01 10:24:17 +00:00
Petar Jovanovic 64c10ba8e2 [MIPS GlobalISel] Select global address
Select G_GLOBAL_VALUE for position dependent code.

Patch by Petar Avramovic.

Differential Revision: https://reviews.llvm.org/D49803

llvm-svn: 338499
2018-08-01 09:03:23 +00:00
David Bolvansky fbbb83c782 Revert "Enrich inline messages", tests fail
llvm-svn: 338496
2018-08-01 08:02:40 +00:00
Hans Wennborg f23db0d7a2 Add llvm-rc to LLVM_TOOLCHAIN_TOOLS (PR38386)
This means it will be installed also in builds configured with
LLVM_INSTALL_TOOLCHAIN_ONLY, such as the Windows packages.

llvm-svn: 338495
2018-08-01 07:51:55 +00:00
David Bolvansky 7f36cd9d96 Enrich inline messages
Summary:
This patch improves Inliner to provide causes/reasons for negative inline decisions.
1. It adds one new message field to InlineCost to report causes for Always and Never instances. All Never and Always instantiations must provide a simple message.
2. Several functions that used to return the inlining results as boolean are changed to return InlineResult which carries the cause for negative decision.
3. Changed remark priniting and debug output messages to provide the additional messages and related inline cost.
4. Adjusted tests for changed printing.

Patch by: yrouban (Yevgeny Rouban)


Reviewers: craig.topper, sammccall, sgraenitz, NutshellySima, shchenz, chandlerc, apilipenko, javed.absar, tejohnson, dblaikie, sanjoy, eraman, xbolva00

Reviewed By: tejohnson, xbolva00

Subscribers: xbolva00, llvm-commits, arsenm, mehdi_amini, eraman, haicheng, steven_wu, dexonsmith

Differential Revision: https://reviews.llvm.org/D49412

llvm-svn: 338494
2018-08-01 07:37:16 +00:00
Martin Storsjo d4590c38ab [AArch64] Disallow the MachO specific .loh directive for windows
Also add a test for it being unsupported for linux.

Differential Revision: https://reviews.llvm.org/D49929

llvm-svn: 338493
2018-08-01 06:50:18 +00:00
Craig Topper 65a1388881 [X86] When looking for (CMOV C-1, (ADD (CTTZ X), C), (X != 0)) -> (ADD (CMOV (CTTZ X), -1, (X != 0)), C), make sure we really have a compare with 0.
It's not strictly required by the transform of the cmov and the add, but it makes sure we restrict it to the cases we know we want to match.

While there canonicalize the operand order of the cmov to simplify the matching and emitting code.

llvm-svn: 338492
2018-08-01 06:36:20 +00:00
Victor Leschuk 64e0c56717 [DWARF] Basic support for producing DWARFv5 .debug_addr section
This revision implements support for generating DWARFv5 .debug_addr section.
The implementation is pretty straight-forward: we just check the dwarf version
and emit section header if needed.

Reviewers: aprantl, dblaikie, probinson

Reviewed by: dblaikie

Differential Revision: https://reviews.llvm.org/D50005

llvm-svn: 338487
2018-08-01 05:48:06 +00:00
Hiroshi Inoue 02f79eae06 [InstSimplify] fold extracting from std::pair (1/2)
This patch intends to enable jump threading when a method whose return type is std::pair<int, bool> or std::pair<bool, int> is inlined.
For example, jump threading does not happen for the if statement in func.

std::pair<int, bool> callee(int v) {
  int a = dummy(v);
  if (a) return std::make_pair(dummy(v), true);
  else return std::make_pair(v, v < 0);
}

int func(int v) {
  std::pair<int, bool> rc = callee(v);
  if (rc.second) {
    // do something
  }

SROA executed before the method inlining replaces std::pair by i64 without splitting in both callee and func since at this point no access to the individual fields is seen to SROA.
After inlining, jump threading fails to identify that the incoming value is a constant due to additional instructions (like or, and, trunc).

This series of patch add patterns in InstructionSimplify to fold extraction of members of std::pair. To help jump threading, actually we need to optimize the code sequence spanning multiple BBs.
These patches does not handle phi by itself, but these additional patterns help NewGVN pass, which calls instsimplify to check opportunities for simplifying instructions over phi, apply phi-of-ops optimization to result in successful jump threading. 
SimplifyDemandedBits in InstCombine, can do more general optimization but this patch aims to provide opportunities for other optimizers by supporting a simple but common case in InstSimplify.

This first patch in the series handles code sequences that merges two values using shl and or and then extracts one value using lshr.

Differential Revision: https://reviews.llvm.org/D48828

llvm-svn: 338485
2018-08-01 04:40:32 +00:00
Hsiangkai Wang 2089e4c8f1 [DebugInfo] Fix build failed in clang-x86_64-linux-selfhost-modules.
Only generate symbol difference expression if needed.

llvm-svn: 338484
2018-08-01 04:17:41 +00:00
Jatin Bhateja 36432a70c1 [X86] Adding more test patterns for lea-opt (PR37939)
Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50128

llvm-svn: 338483
2018-08-01 03:53:27 +00:00
Chandler Carruth 2ce191e220 [x86] Fix a really subtle miscompile due to a somewhat glaring bug in
EFLAGS copy lowering.

If you have a branch of LLVM, you may want to cherrypick this. It is
extremely unlikely to hit this case empirically, but it will likely
manifest as an "impossible" branch being taken somewhere, and will be
... very hard to debug.

Hitting this requires complex conditions living across complex control
flow combined with some interesting memory (non-stack) initialized with
the results of a comparison. Also, because you have to arrange for an
EFLAGS copy to be in *just* the right place, almost anything you do to
the code will hide the bug. I was unable to reduce anything remotely
resembling a "good" test case from the place where I hit it, and so
instead I have constructed synthetic MIR testing that directly exercises
the bug in question (as well as the good behavior for completeness).

The issue is that we would mistakenly assume any SETcc with a valid
condition and an initial operand that was a register and a virtual
register at that to be a register *defining* SETcc...

It isn't though....

This would in turn cause us to test some other bizarre register,
typically the base pointer of some memory. Now, testing this register
and using that to branch on doesn't make any sense. It even fails the
machine verifier (if you are running it) due to the wrong register
class. But it will make it through LLVM, assemble, and it *looks*
fine... But wow do you get a very unsual and surprising branch taken in
your actual code.

The fix is to actually check what kind of SETcc instruction we're
dealing with. Because there are a bunch of them, I just test the
may-store bit in the instruction. I've also added an assert for sanity
that ensure we are, in fact, *defining* the register operand. =D

llvm-svn: 338481
2018-08-01 03:01:58 +00:00
Chandler Carruth 014047a99a [x86/slh] Add unwind info to several tests to make it more obvious that
we aren't incorrectly generating any of it when doing SLH.

There was a bug that only occured with SLH that very much looked like it
could be caused by bad unwind info, and so this was a prime suspect.
Turns out that everything is fine, but this way we'll *see* if we end
up, for example, putting things we shouldn't inside the prolog.

llvm-svn: 338480
2018-08-01 03:01:10 +00:00
Hsiangkai Wang 5c63af0d04 [DebugInfo] Generate fixups as emitting DWARF .debug_line.
It is necessary to generate fixups in .debug_line as relaxation is
enabled due to the address delta may be changed after relaxation.

DWARF will record the mappings of lines and addresses in
.debug_line section. It will encode the information using special
opcodes, standard opcodes and extended opcodes in Line Number
Program. I use DW_LNS_fixed_advance_pc to encode fixed length
address delta and DW_LNE_set_address to encode absolute address
to make it possible to generate fixups in .debug_line section.

Differential Revision: https://reviews.llvm.org/D46850

llvm-svn: 338477
2018-08-01 02:18:06 +00:00
Amara Emerson 6cdfe29d8e [GlobalISel][IRTranslator] Use RPO traversal when visiting blocks to translate.
Previously we were just visiting the blocks in the function in IR order, which
is rather arbitrary. Therefore we wouldn't always visit defs before uses, but
the translation code relies on this assumption in some places.

Only codegen change seen in tests is an elision of a redundant copy.

Fixes PR38396

llvm-svn: 338476
2018-08-01 02:17:42 +00:00
Konstantin Zhuravlyov bb30ef7af4 AMDGPU: Add clamp bit to dot intrinsics
Differential Revision: https://reviews.llvm.org/D49874

llvm-svn: 338470
2018-08-01 01:31:30 +00:00
Eric Christopher 7a70be6865 Simplify selectELFSectionForGlobal by pulling out the entry size
determination for mergeable sections into a small static function.

llvm-svn: 338469
2018-08-01 01:29:30 +00:00
Eric Christopher ad36c74562 Tidy up logic around unique section name creation and remove a
mostly unused variable.

llvm-svn: 338468
2018-08-01 01:03:34 +00:00
Eli Friedman da08078fb2 [MachineOutliner] Clean up subtarget handling.
Call shouldOutlineFromFunctionByDefault, isFunctionSafeToOutlineFrom,
getOutliningType, and getMachineOutlinerMBBFlags using the correct
TargetInstrInfo. And don't create a MachineFunction for a function
declaration.

The call to getOutliningCandidateInfo is still a little weird, but at
least the weirdness is explicitly called out.

Differential Revision: https://reviews.llvm.org/D49880

llvm-svn: 338465
2018-08-01 00:37:20 +00:00
Evandro Menezes eb159e56a6 [PATCH] [SLC] Test simplification of pow() for vector types (NFC)
Add test case for the simplification of `pow()` for vector types that D50035
enables.

llvm-svn: 338463
2018-08-01 00:30:43 +00:00
Reid Kleckner b32ff46ff7 Revert r338354 "[ARM] Revert r337821"
Disable ARMCodeGenPrepare by default again. It is causing verifier
failues in V8 that look like:

  Duplicate integer as switch case
  switch i32 %trunc, label %if.end13 [
    i32 0, label %cleanup36
    i32 0, label %if.then8
  ], !dbg !4981
  i32 0
  fatal error: error in backend: Broken function found, compilation aborted!

I will continue reducing the test case and send it along.

llvm-svn: 338452
2018-07-31 23:09:42 +00:00
David L. Jones 9fb2e3ceaa [WebAssembly] Fix debug info tests after r338437.
After r338437, debug_ranges are no longer emitted. Previously, this was only
done for DWARF version 5 and above.

llvm-svn: 338448
2018-07-31 22:24:14 +00:00
Victor Leschuk 58d3399d8a [DWARF] Support for .debug_addr (consumer)
This patch implements basic support for parsing
  and dumping DWARFv5 .debug_addr section.

llvm-svn: 338447
2018-07-31 22:19:19 +00:00
Evandro Menezes 61e4e40750 [SLC] Refactor the simplication of pow() (NFC)
Reword comments and minor code reformatting.

llvm-svn: 338446
2018-07-31 22:11:02 +00:00
Fangrui Song 87b4b8f7b4 [llvm-objcopy] Make --strip-debug strip .gdb_index
Summary:
See binutils-gdb/bfd/elf.c, GNU objcopy also strips .stab* (STABS)
.line* (DWARF 1) .gnu.linkonce.wi.* (linkonce section for .debug_info) but
I'm not sure we need to be compatible with it.

Reviewers: dblaikie, alexshap, jakehehrlich, jhenderson

Reviewed By: alexshap, jakehehrlich

Subscribers: aprantl, JDevlieghere, jakehehrlich, llvm-commits

Differential Revision: https://reviews.llvm.org/D50100

llvm-svn: 338443
2018-07-31 21:26:35 +00:00
George Burgess IV 497e8fad51 Revert r338431: "Add DebugCounters to DivRemPairs"
This reverts r338431; the test it added is making buildbots unhappy.
Locally, I can repro the failure on reverse-iteration builds.

llvm-svn: 338442
2018-07-31 21:18:44 +00:00
Wolfgang Pieb baf94f830b [DWARF] Do not create a .debug_ranges section when no ranges are needed.
Reviewers: aprantl

Differential Revision: https://reviews.llvm.org/D50089

llvm-svn: 338437
2018-07-31 20:56:32 +00:00
Matt Arsenault 118c47b6d1 AMDGPU: Split amdgcn/r600 fminnum/fmaxnum tests
R600 breaks on too many things to usefully test changes
with ieee_mode on vs. off.

llvm-svn: 338435
2018-07-31 20:38:42 +00:00
George Burgess IV 907f4f6a74 Add DebugCounters to DivRemPairs
For people who don't use DebugCounters, NFCI.

Patch by Zhizhou Yang!

Differential Revision: https://reviews.llvm.org/D50033

llvm-svn: 338431
2018-07-31 20:07:46 +00:00
Matt Davis 5ceaa98a4e [llvm-mca] Update the help text to reflect "physical" registers. NFC.
llvm-svn: 338430
2018-07-31 20:05:08 +00:00
Jonas Paulsson 590b1fc881 [SystemZ] Fix bad assert composition.
Use '&&' before the string instead of '||'

llvm-svn: 338429
2018-07-31 19:58:42 +00:00
Matt Arsenault dcec0888e2 DAG: Correct pointer type used for stack slot
Correct the address space for the inserted argument
stack slot.

AMDGPU seems to not do anything with this information,
so I don't think this was breaking anything.

llvm-svn: 338428
2018-07-31 19:51:20 +00:00
Alexandre Ganea b92d3ad762 [CodeView] Add coverage test for r338308 (Fixed crash in type merging)
llvm-svn: 338423
2018-07-31 19:30:03 +00:00
Matt Arsenault feedabfde7 AMDGPU: Break 64-bit arguments into 32-bit pieces
llvm-svn: 338421
2018-07-31 19:29:04 +00:00
Matt Arsenault 0395da7842 AMDGPU: Split wide vectors of i16/f16 into 32-bit regs on calls
This improves code for the same reasons as scalarizing 32-bit
element vectors.

llvm-svn: 338418
2018-07-31 19:17:47 +00:00
Alexandre Ganea ee8a720051 [CodeView] Minimal support for S_UNAMESPACE records
Differential Revision: https://reviews.llvm.org/D50007

llvm-svn: 338417
2018-07-31 19:15:50 +00:00
Matt Arsenault 9ced1e0d80 AMDGPU: Scalarize vector argument types to calls
When lowering calling conventions, prefer to decompose vectors
into the constitute register types. This avoids artifical constraints
to satisfy a wide super-register.

This improves code quality because now optimizations don't need to
deal with the super-register constraint. For example the immediate
folding code doesn't deal with 4 component reg_sequences, so by
breaking the register down earlier the existing immediate folding
code is able to work.

This also avoids the need for the shader input processing code
to manually split vector types.

llvm-svn: 338416
2018-07-31 19:05:14 +00:00
Matt Davis e8c70bc187 [llvm-mca][docs] Replace "temporary" with "physical registers". NFC.
llvm-svn: 338415
2018-07-31 18:59:46 +00:00
Simon Pilgrim 67caf04d3a [X86] WriteBSWAP sched classes are reg-reg only.
Don't declare them as X86SchedWritePair when the folded class will never be used.

Note: MOVBE (load/store endian conversion) instructions tend to have a very different behaviour to BSWAP.
llvm-svn: 338412
2018-07-31 18:24:24 +00:00
Andrea Di Biagio 1dac6ba7e2 [llvm-mca][docs] Improve the "How LLVM-MCA works" section.
llvm-svn: 338410
2018-07-31 18:19:15 +00:00
Vlad Tsyrklevich 48ed9acede Revert "[DebugInfo] Generate DWARF debug information for labels."
This reverts commits r338390 and r338398, they were causing LSan
failures on the ASan bot.

llvm-svn: 338408
2018-07-31 18:10:37 +00:00
Simon Pilgrim 5d9b00d15b [X86][SSE] Use ISD::MULHU for constant/non-zero ISD::SRL lowering (PR38151)
As was done for vector rotations, we can efficiently use ISD::MULHU for vXi8/vXi16 ISD::SRL lowering.

Shift-by-zero cases are still problematic (mainly on v32i8 due to extra AND/ANDN/OR or VPBLENDVB blend masks but v8i16/v16i16 aren't great either if PBLENDW fails) so I've limited this first patch to known non-zero cases if we can't easily use PBLENDW.

Differential Revision: https://reviews.llvm.org/D49562

llvm-svn: 338407
2018-07-31 18:05:56 +00:00
Rui Ueyama 7f97570e79 Make ICF log output order deterministic.
This patch does the same thing as r338153 for COFF.
Note that this patch affects only the order of log messages.
The output file is already deterministic.

Differential Revision: https://reviews.llvm.org/D50023

llvm-svn: 338406
2018-07-31 18:04:58 +00:00
Simon Pilgrim 1f4b9cb6fe [llvm-mca][x86] Add 32-bit instruction resource tests
These aren't exhaustive, but cover some instructions that are only available in 32-bit mode (where would we be without good BCD math performance?).

llvm-svn: 338404
2018-07-31 17:33:08 +00:00
Zachary Turner d30700f82d Resubmit r338340 "[MS Demangler] Better demangling of template arguments."
This broke the build with GCC, but has since been fixed.

llvm-svn: 338403
2018-07-31 17:16:44 +00:00
Craig Topper bef126fb71 [X86] Add pattern matching for PMADDUBSW
Summary:
Similar to D49636, but for PMADDUBSW. This instruction has the additional complexity that the addition of the two products saturates to 16-bits rather than wrapping around. And one operand is treated as signed and the other as unsigned.

A C example that triggers this pattern

```
static const int N = 128;

int8_t A[2*N];
uint8_t B[2*N];
int16_t C[N];

void foo() {
  for (int i = 0; i != N; ++i)
    C[i] = MIN(MAX((int16_t)A[2*i]*(int16_t)B[2*i] + (int16_t)A[2*i+1]*(int16_t)B[2*i+1], -32768), 32767);
}
```

Reviewers: RKSimon, spatel, zvi

Reviewed By: RKSimon, zvi

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49829

llvm-svn: 338402
2018-07-31 17:12:08 +00:00
Craig Topper d03d44e0b9 [X86] Add test cases that could use PMADDUBSW.
llvm-svn: 338401
2018-07-31 17:12:06 +00:00
Francis Visoiu Mistrih ae8002c1cf [X86] Preserve more liveness information in emitStackProbeInline
This commit fixes two issues with the liveness information after the
call:

1) The code always spills RCX and RDX if InProlog == true, which results
in an use of undefined phys reg.
2) FinalReg, JoinReg, RoundedReg, SizeReg are not added as live-ins to
the basic blocks that use them, therefore they are seen undefined.

https://llvm.org/PR38376

Differential Revision: https://reviews.llvm.org/D50020

llvm-svn: 338400
2018-07-31 16:41:12 +00:00
Hsiangkai Wang 68c6860434 [DebugInfo] Fix build failed in 'clang-cmake-armv8-full'.
Builder clang-cmake-armv8-full failed due to the assembly 'comment'
notation is not '#' in the target. So, I use CHECK-SAME to avoid to
check the comment notation in the same line in the test case.

llvm-svn: 338398
2018-07-31 16:22:09 +00:00
Jakub Kuderski 3ae770aa2b [Dominators] Make slow walks shorter
Summary:
When DFS numbers are not yet calculated for a dominator tree, we have to walk it up to say whether one node dominates some other.

This patch makes the slow walks shorter by only walking until the level of the node we check against is reached. This is because a node cannot possibly dominate something higher in its tree.

When running opt with -O3, the patch results in:
* 25% fewer loop iterations for `opt` (fullLTO)
* 30% fewer loop iterations for sqlite

Reviewers: brzycki, asbirlea, chandlerc, NutshellySima, grosser

Reviewed By: NutshellySima

Subscribers: mehdi_amini, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D49955

llvm-svn: 338396
2018-07-31 15:53:10 +00:00
Ewan Crawford d83beb804c Fix InstCombine address space assert
Workaround bug where the InstCombine pass was asserting on the IR added in lit
test, where we have a bitcast instruction after a GEP from an addrspace cast.

The second bitcast in the test was getting combined into
`bitcast <16 x i32>* %0 to <16 x i32> addrspace(3)*`, which looks like it should
be an addrspace cast instruction instead. Otherwise if control flow is allowed
to continue as it is now we create a GEP instruction
`<badref> = getelementptr inbounds <16 x i32>, <16 x i32>* %0, i32 0`. However
because the type of this instruction doesn't match the address space we hit an
assert when replacing the bitcast with that GEP.

```
void llvm::Value::doRAUW(llvm::Value*, bool): Assertion `New->getType() == getType() && "replaceAllUses of value with new value of different type!"' failed.
```

Differential Revision: https://reviews.llvm.org/D50058

llvm-svn: 338395
2018-07-31 15:53:03 +00:00
Andrea Di Biagio bdcf6ad60d [llvm-mca][docs] Always use `llvm-mca` in place of `MCA`.
llvm-svn: 338394
2018-07-31 15:29:10 +00:00
Sanjay Patel a35781fdf9 [InstCombine] regenerate checks and add tests for D50035; NFC
llvm-svn: 338392
2018-07-31 15:07:32 +00:00
Anastasis Grammenos ac3f8028da [DebugInfo][LCSSA] Preserve debug location in lcssa phis
Summary:
When inserting lcssa Phi Nodes in the exit block
mak sure to preserve the original instructions DL.

Reviewers: vsk

Subscribers: JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D50009

llvm-svn: 338391
2018-07-31 14:54:52 +00:00
Hsiangkai Wang cbc58ada99 [DebugInfo] Generate DWARF debug information for labels.
There are two forms for label debug information in DWARF format.

1. Labels in a non-inlined function:

DW_TAG_label
  DW_AT_name
  DW_AT_decl_file
  DW_AT_decl_line
  DW_AT_low_pc

2. Labels in an inlined function:

DW_TAG_label
  DW_AT_abstract_origin
  DW_AT_low_pc

We will collect label information from DBG_LABEL. Before every DBG_LABEL,
we will generate a temporary symbol to denote the location of the label.
The symbol could be used to get DW_AT_low_pc afterwards. So, we create a
mapping between 'inlined label' and DBG_LABEL MachineInstr in DebugHandlerBase.
The DBG_LABEL in the mapping is used to query the symbol before it.

The AbstractLabels in DwarfCompileUnit is used to process labels in inlined
functions.

We also keep a mapping between scope and labels in DwarfFile to help to
generate correct tree structure of DIEs.

It also generates label debug information under global isel.

Differential Revision: https://reviews.llvm.org/D45556

llvm-svn: 338390
2018-07-31 14:48:32 +00:00
David Bolvansky ab79414f7b Revert Enrich inline messages
llvm-svn: 338389
2018-07-31 14:47:22 +00:00
Sanjay Patel 57d617d676 [InstCombine] auto-generate checks; NFC
llvm-svn: 338388
2018-07-31 14:27:30 +00:00
David Bolvansky b562dbabda Enrich inline messages
Summary:
This patch improves Inliner to provide causes/reasons for negative inline decisions.
1. It adds one new message field to InlineCost to report causes for Always and Never instances. All Never and Always instantiations must provide a simple message.
2. Several functions that used to return the inlining results as boolean are changed to return InlineResult which carries the cause for negative decision.
3. Changed remark priniting and debug output messages to provide the additional messages and related inline cost.
4. Adjusted tests for changed printing.

Patch by: yrouban (Yevgeny Rouban)


Reviewers: craig.topper, sammccall, sgraenitz, NutshellySima, shchenz, chandlerc, apilipenko, javed.absar, tejohnson, dblaikie, sanjoy, eraman, xbolva00

Reviewed By: tejohnson, xbolva00

Subscribers: xbolva00, llvm-commits, arsenm, mehdi_amini, eraman, haicheng, steven_wu, dexonsmith

Differential Revision: https://reviews.llvm.org/D49412

llvm-svn: 338387
2018-07-31 14:25:24 +00:00
Andrea Di Biagio 4a31bcff3f [llvm-mca] Remove README.txt
A detailed description of the tool has been recently added by Matt to
CommandGuide/llvm-mca.rst. File README.txt is now redundant and can be removed;
all the relevant user-guide information has been improved and then moved to
llvm-mca.rst.

In future, we should add another .rst for the "llvm-mca developer manual" to
provide infromation about:
 - llvm-mca internals.
 - How to add custom stages to the simulated pipeline.
 - How to provide extra processor info in the scheduling model to improve the
   analysis performed by llvm-mca.

llvm-svn: 338386
2018-07-31 14:23:49 +00:00
John Brawn cd5f37f3f1 [MemDep] Use PhiValuesAnalysis to improve alias analysis results
This is being done in order to make GVN able to better optimize certain inputs.
MemDep doesn't use PhiValues directly, but does need to notifiy it when things
get invalidated.

Differential Revision: https://reviews.llvm.org/D48489

llvm-svn: 338384
2018-07-31 14:19:29 +00:00
David Bolvansky 16d8a69b90 [InstSimplify] Fold another Select with And/Or pattern
Summary: Proof: https://rise4fun.com/Alive/L5J

Reviewers: lebedev.ri, spatel

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49975

llvm-svn: 338383
2018-07-31 14:17:15 +00:00
Matt Arsenault a5ed032118 DAG: Fix PromoteFloatResult for fcanonicalize
llvm-svn: 338382
2018-07-31 14:15:22 +00:00
Matt Arsenault 638a202760 AMDGPU: Don't handle FP16_TO_FP in isCanonicalized
This needs more special handling to do correctly.
Fixes test in subsequent commit.

llvm-svn: 338381
2018-07-31 14:15:16 +00:00
Alexey Bataev c0c3a6ed5e [SLP] Fix PR38339: Instruction does not dominate all uses!
Summary:
If the ExtractElement instructions can be optimized out during the
vectorization and we need to reshuffle the parent vector, this
ShuffleInstruction may be inserted in the wrong place causing compiler
to produce incorrect code.

Reviewers: spatel, RKSimon, mkuper, hfinkel, javed.absar

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49928

llvm-svn: 338380
2018-07-31 14:02:43 +00:00
Matt Arsenault 4aec86d37a AMDGPU: Fold undef fcanonicalize to qNaN
We could choose a free 0 for this, but this
matches the behavior for fmul undef, 1.0. Also,
the NaN use is more useful for folding use operations
although if it's not eliminated it is more expensive
in terms of code size.

llvm-svn: 338376
2018-07-31 13:34:31 +00:00
Matt Arsenault c1335eaf7e AMDGPU: Fix test check line bugs
llvm-svn: 338374
2018-07-31 13:25:23 +00:00
Peter Smith 08b54bb4ff [ARM] Complete enumeration values for Tag_ABI_VFP_args
The LLD implementation of Tag_ABI_VFP_args needs to check the rarely seen
values of 3 (toolchain specific) and 4 compatible with both Base and VFP.
Add the missing enumeration values so that LLD can refer to them without
having to use the raw numbers.

Differential Revision: https://reviews.llvm.org/D50049

llvm-svn: 338373
2018-07-31 13:24:49 +00:00
Andrea Di Biagio a1852b6194 [llvm-mca][BtVer2] Teach how to identify dependency-breaking idioms.
This patch teaches llvm-mca how to identify dependency breaking instructions on
btver2.

An example of dependency breaking instructions is the zero-idiom XOR (example:
`XOR %eax, %eax`), which always generates zero regardless of the actual value of
the input register operands.
Dependency breaking instructions don't have to wait on their input register
operands before executing. This is because the computation is not dependent on
the inputs.

Not all dependency breaking idioms are also zero-latency instructions. For
example, `CMPEQ %xmm1, %xmm1` is independent on
the value of XMM1, and it generates a vector of all-ones.
That instruction is not eliminated at register renaming stage, and its opcode is
issued to a pipeline for execution. So, the latency is not zero. 

This patch adds a new method named isDependencyBreaking() to the MCInstrAnalysis
interface. That method takes as input an instruction (i.e. MCInst) and a
MCSubtargetInfo.
The default implementation of isDependencyBreaking() conservatively returns
false for all instructions. Targets may override the default behavior for
specific CPUs, and return a value which better matches the subtarget behavior.

In future, we should teach to Tablegen how to automatically generate the body of
isDependencyBreaking from scheduling predicate definitions. This would allow us
to expose the knowledge about dependency breaking instructions to the machine
schedulers (and, potentially, other codegen passes).

Differential Revision: https://reviews.llvm.org/D49310

llvm-svn: 338372
2018-07-31 13:21:43 +00:00
Peter Smith d9e74901fd [ELF][ARM] Add Arm ABI names for float ABI ELF Header flags
The ELF for the Arm architecture document defines, for EF_ARM_EABI_VER5 and
above, the flags EF_ARM_ABI_FLOAT_HARD and EF_ARM_ABI_FLOAT_SOFT. These
have been defined to be compatible with the existing EF_ARM_VFP_FLOAT and
EF_ARM_SOFT_FLOAT used by gcc for EF_ARM_EABI_UNKNOWN.

This patch adds the flags in addition to the existing ones so that any code
depending on the old names will still work.

Differential Revision: https://reviews.llvm.org/D49992

llvm-svn: 338370
2018-07-31 13:03:54 +00:00
Simon Pilgrim 0aa2867545 Revert r338365: [X86] Improved sched models for X86 BT*rr instructions.
https://reviews.llvm.org/D49243

Contains WIP code that should not have been included.

llvm-svn: 338369
2018-07-31 13:00:51 +00:00
Jonas Paulsson 2f12e45d5a [SystemZ] Improve decoding in case of instructions with four register operands.
Since z13, the max group size will be 2 if any μop has more than 3 register
sources.

This has been ignored sofar in the SystemZHazardRecognizer, but is now
handled by recognizing those instructions and adjusting the tracking of
decoding and the cost heuristic for grouping.

Review: Ulrich Weigand
https://reviews.llvm.org/D49847

llvm-svn: 338368
2018-07-31 13:00:42 +00:00
Sanjay Patel 9a801cb598 [InstCombine] simplify code for A & (A ^ B) --> A & ~B
This fold was written in an odd way and tried to avoid
an endless loop by bailing out on all constants instead
of the supposedly problematic case of -1. But (X & -1) 
should always be simplified before we reach here, so I'm
not sure how that is a problem.

There were no tests for the commuted patterns, so I added
those at rL338364.

llvm-svn: 338367
2018-07-31 13:00:03 +00:00
Andrew V. Tischenko e6f5ace81a [X86] Improved sched models for X86 BT*rr instructions.
https://reviews.llvm.org/D49243

llvm-svn: 338365
2018-07-31 12:33:48 +00:00
Sanjay Patel 995138ce60 [InstCombine] move/add tests for xor+add fold; NFC
llvm-svn: 338364
2018-07-31 12:31:00 +00:00
Andrew V. Tischenko e564055671 [X86] Improved sched models for X86 SHLD/SHRD* instructions.
Differential Revision: https://reviews.llvm.org/D9611

llvm-svn: 338359
2018-07-31 10:14:43 +00:00
Simon Pilgrim 99d475f97d [X86][SSE] isFNEG - Use getTargetConstantBitsFromNode to handle all constant cases
isFNEG was duplicating much of what was done by getTargetConstantBitsFromNode in its own calls to getTargetConstantFromNode.

Noticed while reviewing D48467.

llvm-svn: 338358
2018-07-31 10:13:17 +00:00
Martin Storsjo 293079f2de [ARM] Allow automatically deducing the thumb instruction size for .inst
This matches GAS, that allows unsuffixed .inst for thumb.

Differential Revision: https://reviews.llvm.org/D49937

llvm-svn: 338357
2018-07-31 09:27:07 +00:00
Martin Storsjo af18947f0a [ARM] Support the .inst directive for MachO and COFF targets
Contrary to ELF, we don't add any markers that distinguish data generated
with .short/.long from normal instructions, so the .inst directive only
adds compatibility with assembly that uses it.

Differential Revision: https://reviews.llvm.org/D49936

llvm-svn: 338356
2018-07-31 09:27:01 +00:00
Martin Storsjo 3e3d39d07e [AArch64] Support the .inst directive for MachO and COFF targets
Contrary to ELF, we don't add any markers that distinguish data generated
with .long from normal instructions, so the .inst directive only adds
compatibility with assembly that uses it.

Differential Revision: https://reviews.llvm.org/D49935

llvm-svn: 338355
2018-07-31 09:26:52 +00:00
Sam Parker 2a6c842fda [ARM] Revert r337821
Re-enabling ARMCodeGenPrepare by default after failing to reproduce
the bootstrap issues that I was concerned it was causing.

llvm-svn: 338354
2018-07-31 09:04:14 +00:00
Hsiangkai Wang 615540d0f2 Test commit.
llvm-svn: 338352
2018-07-31 06:09:29 +00:00
Hiroshi Inoue 2f6769be60 [InstSimplify] tests for D48828, D49981: fold extraction from std::pair
Minor touch up in the previous comment.

llvm-svn: 338351
2018-07-31 05:29:20 +00:00
Hiroshi Inoue 5427d3efc2 [InstSimplify] tests for D48828, D49981: fold extraction from std::pair
Updated unit tests for D48828 and D49981.

llvm-svn: 338350
2018-07-31 05:10:36 +00:00
Max Kazantsev eb8e9c0940 [NFC] Collect statistics in GuardWidening
llvm-svn: 338348
2018-07-31 04:37:11 +00:00
Diego Caballero 3587150fcb [VPlan] Introduce VPLoopInfo analysis.
The patch introduces loop analysis (VPLoopInfo/VPLoop) for VPBlockBases.
This analysis will be necessary to perform some H-CFG transformations and
detect and introduce regions representing a loop in the H-CFG.

Reviewers: fhahn, rengolin, mkuper, hfinkel, mssimpso

Reviewed By: fhahn 

Differential Revision: https://reviews.llvm.org/D48816

llvm-svn: 338346
2018-07-31 01:57:29 +00:00
Reid Kleckner d2bad6c639 Revert r338340 "[MS Demangler] Better demangling of template arguments."
Breaks the build with GCC, apparently.

llvm-svn: 338344
2018-07-31 01:08:42 +00:00
Craig Topper 9164b9b16e [X86] Stop accidentally running the Bonnell LEA fixup path on Goldmont.
In one place we checked X86Subtarget.slowLEA() to decide if the pass should run. But to decide what the pass should we only check isSLM. This resulted in Goldmont going down the Bonnell path.

llvm-svn: 338342
2018-07-31 00:43:54 +00:00
Ana Pazos 2baa767455 [RISCV] Fixed test case failure due to r338047
llvm-svn: 338341
2018-07-31 00:36:28 +00:00
Zachary Turner 4f85809a84 [MS Demangler] Better demangling of template arguments.
This patch fixes demangling of template aliases as template-template
arguments, and also fixes function pointers and references as
not type template parameters.  All of these can be properly
demangled now, so I've ported over the test
clang/test/CodeGenCXX/ms-template-callbacks.cpp.  All of these
tests pass

llvm-svn: 338340
2018-07-31 00:26:52 +00:00
Amara Emerson 1e8c164c63 [AArch64][GlobalISel] Add isel support for G_BLOCK_ADDR.
Also refactors some existing code to materialize addresses for the large code
model so it can be shared between G_GLOBAL_VALUE and G_BLOCK_ADDR.

This implements PR36390.

Differential Revision: https://reviews.llvm.org/D49903

llvm-svn: 338337
2018-07-31 00:09:02 +00:00
Amara Emerson 0e86c07077 [AArch64][GlobalISel] Make G_BLOCK_ADDR legal.
Differential Revision: https://reviews.llvm.org/D49902

llvm-svn: 338336
2018-07-31 00:08:56 +00:00
Amara Emerson 6aff5a7810 [GlobalISel] Add a G_BLOCK_ADDR opcode to handle IR blockaddress constants.
Differential Revision: https://reviews.llvm.org/D49900

llvm-svn: 338335
2018-07-31 00:08:50 +00:00
Zachary Turner a51403f5cc [MS Demangler] Add ms-return-qualifiers.test.
This is a copy of the tests from
clang/CodeGenCXX/ms-return-qualifiers.cpp converted to demangling
tests.

llvm-svn: 338330
2018-07-30 23:22:39 +00:00
Craig Topper 2f60ef2c78 [DAGCombiner][TargetLowering] Pass a SmallVector instead of a std::vector to BuildSDIV/BuildUDIV/etc.
The vector contains the SDNodes that these functions create. The number of nodes is always a small number so we should use SmallVector to avoid a heap allocation.

llvm-svn: 338329
2018-07-30 23:22:00 +00:00
Zachary Turner 931e879cef [MS Demangler] Add rudimentary C++11 Support
This patch adds support for demangling r-value references, new
operators such as the ""_foo operator, lambdas, alias types,
nullptr_t, and various other C++11'isms.

There is 1 failing test remaining in this file, which appears to
be related to back-referencing. This type of problem has the
potential to get ugly so I'd rather fix it in a separate patch.

Differential Revision: https://reviews.llvm.org/D50013

llvm-svn: 338324
2018-07-30 23:02:10 +00:00
Matt Davis 8d253a71c0 [llvm-mca][docs] Add instruction flow documentation. NFC.
Summary:
This patch mostly copies the existing Instruction Flow, and stage descriptions
from the mca README.  I made a few text tweaks, but no semantic changes,
and made reference to the "default pipeline."  I also removed the internals
references (e.g., reference to class names and header files).  I did leave the
LSUnit name around, but only as an abbreviated word for the load-store unit.


Reviewers: andreadb, courbet, RKSimon, gbedwell, filcab

Reviewed By: andreadb

Subscribers: tschuett, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D49692

llvm-svn: 338319
2018-07-30 22:30:14 +00:00
Sanjay Patel 9f807f44b1 [DAGCombiner] transform sub-of-shifted-signbit to add
This is exchanging a sub-of-1 with add-of-minus-1:
https://rise4fun.com/Alive/plKAH

This is another step towards improving select-of-constants codegen (see D48970).

x86 is the motivating target, and those diffs all appear to be wins. PPC and AArch64 look neutral.
I've limited this to early combining (!LegalOperations) in case a target wants to reverse it, but
I think canonicalizing to 'add' is more likely to produce further transforms because we have more
folds for 'add'.

Differential Revision: https://reviews.llvm.org/D49924

llvm-svn: 338317
2018-07-30 22:21:37 +00:00
Diego Caballero 2a34ac86d3 [VPlan] Introduce VPlan-based dominator analysis.
The patch introduces dominator analysis for VPBlockBases and extend
VPlan's GraphTraits specialization with the required interfaces. Dominator
analysis will be necessary to perform some H-CFG transformations and
to introduce VPLoopInfo (LoopInfo analysis on top of the VPlan representation).

Reviewers: fhahn, rengolin, mkuper, hfinkel, mssimpso

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D48815

llvm-svn: 338310
2018-07-30 21:33:31 +00:00
Alexandre Ganea 0bb8e89187 This fixes a crash when a second pass is required for the Codeview Type merging *and* the index points outside of the table (which should lead to an error being printed).
This occurs currently until MS precompiled headers .obj is added (see D45213)

Differential Revision: https://reviews.llvm.org/D50006

llvm-svn: 338308
2018-07-30 21:14:25 +00:00
Lang Hames 7361996f15 [ORC] Add SerializationTraits for std::set and std::map.
Also, make SerializationTraits for pairs forward the actual pair
template type arguments to the underlying serializer. This allows, for example,
std::pair<StringRef, bool> to be passed as an argument to an RPC call expecting
a std::pair<std::string, bool>, since there is an underlying serializer from
StringRef to std::string that can be used.

llvm-svn: 338305
2018-07-30 21:08:06 +00:00
Craig Topper 42d312bb83 [TargetLowering] In BuildSDIV, add the MULHS/SMUL_LOHI to the Created vector.
BuildUDIV was already correct.

llvm-svn: 338304
2018-07-30 21:04:38 +00:00
Craig Topper a568a27dfa [DAGCombiner][PowerPC][AArch64] Pass Created vector by reference to BuildSDIVPow2.
llvm-svn: 338303
2018-07-30 21:04:34 +00:00
David Bolvansky 6737b3a6a1 [InstCombine] Fold Select with binary op
Summary:
Fold
  %A = icmp eq i8 %x, 0
  %B = xor i8 %x, %z
  %C = select i1 %A, i8 %B, i8 %y
To
  %C = select i1 %A, i8 %z, i8 %y

Fixes https://bugs.llvm.org/show_bug.cgi?id=38345
Proof: https://rise4fun.com/Alive/43J

Reviewers: lebedev.ri, spatel

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49954

llvm-svn: 338300
2018-07-30 20:38:53 +00:00
Craig Topper b94d5f853b Revert r338222 "[DAGCombiner] Remove unnecessary calls to AddToWorklist."
Thinking about it more it might be possible for the later nodes to be folded in getNode in such a way that the other created nodes are left dead. This can cause use counts to be incorrect on nodes that aren't dead.

So its probably safer to leave this alone.

llvm-svn: 338298
2018-07-30 20:27:10 +00:00
Vlad Tsyrklevich 1c7160e85f Revert "[GVNHoist] Re-enable GVNHoist by default"
This reverts commit r338240 because it was causing OOMs on the UBSan
buildbot when building clang/lib/Sema/SemaChecking.cpp

llvm-svn: 338297
2018-07-30 20:07:33 +00:00
Fangrui Song f78650a8de Remove trailing space
sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h}

llvm-svn: 338293
2018-07-30 19:41:25 +00:00
Manoj Gupta 9d83ce9043 [Inline] Copy "null-pointer-is-valid" attribute in caller.
Summary:
Normally, inling does not happen if caller does not have
"null-pointer-is-valid"="true" attibute but callee has it.

However, alwaysinline may force callee to be inlined.
In this case, if the caller has the "null-pointer-is-valid"="true"
attribute, copy the attribute to caller.

Reviewers: efriedma, a.elovikov, lebedev.ri, jyknight

Reviewed By: efriedma

Subscribers: eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D50000

llvm-svn: 338292
2018-07-30 19:33:53 +00:00
David Bolvansky 76e95865f3 [InstSimplify] [NFC] Tests for Select with AND/OR fold
llvm-svn: 338285
2018-07-30 18:22:18 +00:00
Jessica Paquette fa3bee4756 [MachineOutliner][AArch64] Add support for saving LR to a register
This teaches the outliner to save LR to a register rather than the stack when
possible. This allows us to avoid bumping the stack in outlined functions in
some cases. By doing this, in a later patch, we can teach the outliner to do
something like this:

f1:
  ...
  bl OUTLINED_FUNCTION
  ...

f2:
  ...
  move LR's contents to a register
  bl OUTLINED_FUNCTION
  move the register's contents back

instead of falling back to saving LR in both cases.

llvm-svn: 338278
2018-07-30 17:45:28 +00:00
Craig Topper f014ec9b3b [X86] Fix typo in comment. NFC
llvm-svn: 338274
2018-07-30 17:34:31 +00:00
Craig Topper dd0ef801f8 Recommit r338204 "[X86] Correct the immediate cost for 'add/sub i64 %x, 0x80000000'."
This checks in a more direct way without triggering a UBSAN error.

llvm-svn: 338273
2018-07-30 17:29:57 +00:00
Jessica Paquette bbcc8895bb Add machine verifier to arm64-opt-remarks-lazy-bfi
Previously, I thought this was a Windows failure. Then I realized it failed on
every bot that used the verifier. This makes it use the verifier always, and
adds that pass to the pipeline checks so that it's consistent across all bots.

llvm-svn: 338272
2018-07-30 17:13:25 +00:00
David Bolvansky 2fa7fb14ea [DAGCombiner] Bug 31275- Extract a shift from a constant mul or udiv if a rotate can be formed
Summary:
Attempt to extract a shrl from a udiv or a shl from a mul if this allows a rotate to be formed.  This targets cases where the input to a rotate pattern was a mul or udiv by a constant and InstCombine merged one of the shifts with the op.

Patch by: sameconrad (Sam Conrad)

Reviewers: RKSimon, craig.topper, spatel, lebedev.ri, javed.absar

Reviewed By: lebedev.ri

Subscribers: efriedma, kparzysz, llvm-commits

Differential Revision: https://reviews.llvm.org/D47681

llvm-svn: 338270
2018-07-30 16:50:00 +00:00
Thomas Preud'homme 196149c943 Reapply "Fix crash on inline asm with 64bit matching input in 32bit GPR"
This reapplies commit r338206 reverted by r338214 since the bug that
r338206 uncovered has been fixed in r338268.

Add support for inline assembly with matching input operand that do not
naturally go in the register class it is constrained to (eg. double in a
32-bit GPR). Note that regular input is already handled by existing
code.

llvm-svn: 338269
2018-07-30 16:48:39 +00:00
Thomas Preud'homme 6c1b075299 Fix uninitialized read in ARM's PrintAsmOperand
Summary:
Fix read of uninitialized RC variable in ARM's PrintAsmOperand when
hasRegClassConstraint returns false. This was causing
inline-asm-operand-implicit-cast test to fail in r338206.

Reviewers: t.p.northover, weimingz, javed.absar, chill

Reviewed By: chill

Subscribers: chill, eraman, kristof.beyls, chrib, llvm-commits

Differential Revision: https://reviews.llvm.org/D49984

llvm-svn: 338268
2018-07-30 16:45:40 +00:00
Jessica Paquette 7816531f3c Attempt to fix Windows test failure caused by r338133
It seems like the pass pipeline on Windows is slightly different than on Linux
and macOS. As a result, the arm64-opt-remarks-lazy-bfi test has been failing.

This switches a CHECK-NEXT to a CHECK-DAG to try and get this running properly
again.

It'd be nice to switch it back to a CHECK-NEXT if possible, but the CHECK-NEXT
lines following the line we care about (the optimization remark emitter)
do a pretty good job of enforcing the ordering we want.

Hopefully this works, since I don't have a Windows machine. ;)

Example failure: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/11295

llvm-svn: 338267
2018-07-30 16:36:22 +00:00
Evandro Menezes a7d48286fb [SLC] Refactor the simplication of pow() (NFC)
Use more meaningful variable names.  Mostly NFC.

llvm-svn: 338266
2018-07-30 16:20:04 +00:00
Simon Pilgrim 186b62c9e4 [X86] Regenerate NOBMI/BMI combine-select tests.
Test cleanup for D38128

llvm-svn: 338265
2018-07-30 16:18:38 +00:00
Simon Pilgrim 2d5118432b [X86] Regenerate PKU test to merge 32/64-bit rdpkru checks
Test cleanup for D38128

llvm-svn: 338264
2018-07-30 16:15:18 +00:00
Simon Pilgrim 22ff9f94bb [X86] Regenerate fast-isel tests.
Test cleanup for D38128

llvm-svn: 338262
2018-07-30 16:13:40 +00:00
Sander de Smalen e64206a02c [AArch64][SVE] Asm: Enable instructions to be prefixed.
This patch enables instructions that are destructive on their
destination- and first source operand, to be prefixed with a
MOVPRFX instruction.

This patch also adds a variety of tests:

- positive tests for all instructions and forms that accept a
  movprfx for either or both predicated and unpredicated forms.

- negative tests for all instructions and forms that do not accept
  an unpredicated or predicated movprfx.

- negative tests for the diagnostics that get emitted when a MOVPRFX
  instruction is used incorrectly.

This is patch [2/2] in a series to add MOVPRFX instructions:
- Patch [1/2]: https://reviews.llvm.org/D49592
- Patch [2/2]: https://reviews.llvm.org/D49593

Reviewers: rengolin, SjoerdMeijer, samparker, fhahn, javed.absar

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D49593

llvm-svn: 338261
2018-07-30 16:05:45 +00:00
Sander de Smalen 9b33309c87 [AArch64][SVE] Asm: Add MOVPRFX instructions.
This patch adds predicated and unpredicated MOVPRFX instructions, which
can be prepended to SVE instructions that are destructive on their first
source operand, to make them a constructive operation, e.g.

  add z1.s, p0/m, z1.s, z2.s        <=> z1 = z1 + z2

can be made constructive:

  movprfx z0, z1
  add z0.s, p0/m, z0.s, z2.s        <=> z0 = z1 + z2

The predicated MOVPRFX instruction can additionally be used to zero
inactive elements, e.g.

  movprfx z0.s, p0/z, z1.s
  add z0.s, p0/m, z0.s, z2.s

Not all instructions can be prefixed with the MOVPRFX instruction
which is why this patch also adds a mechanism to validate prefixed
instructions. The exact rules when a MOVPRFX applies is detailed in
the SVE supplement of the Architectural Reference Manual.

This is patch [1/2] in a series to add MOVPRFX instructions:
- Patch [1/2]: https://reviews.llvm.org/D49592
- Patch [2/2]: https://reviews.llvm.org/D49593

Reviewers: rengolin, SjoerdMeijer, samparker, fhahn, javed.absar

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D49592

llvm-svn: 338258
2018-07-30 15:42:46 +00:00
David Bolvansky 403826ee0f [InstCombine] [NFC] Added tests for Select with binop fold
llvm-svn: 338257
2018-07-30 15:38:42 +00:00
Joel Galenson 764b3d8cc5 [doc] Fix Getting Started typo.
This makes it easier for someone to copy-paste this line, change the path, and run the command.

Differential Revision: https://reviews.llvm.org/D49201

llvm-svn: 338254
2018-07-30 15:14:24 +00:00
Krzysztof Parzyszek 24fae50905 [Hexagon] Simplify A4_rcmp[n]eqi R, 0
Consider cases when register R is known to be zero/non-zero, or when it
is defined by a C2_muxii instruction.

llvm-svn: 338251
2018-07-30 14:28:02 +00:00
John Brawn 898cd398d3 Adjust opt pass pipeline tests to cope with combination of r338240 and r338242
The combination of r338240 and r338242 causes the opt pass pipeline tests to
fail because of how r338242 makes BasicAA be invalidated more often. Adjust the
tests to reflect this.

llvm-svn: 338250
2018-07-30 14:26:24 +00:00
Matt Arsenault de496c32a4 AMDGPU: Reduce code size with fcanonicalize (fneg x)
When fcanonicalize is lowered to a mul, we can
use -1.0 for free and avoid the cost of the bigger
encoding for source modifers.

llvm-svn: 338244
2018-07-30 12:16:58 +00:00
Matt Arsenault f3c9a34def AMDGPU: Make fneg combine handle fcanonicalize
llvm-svn: 338243
2018-07-30 12:16:47 +00:00
John Brawn cd73fe8989 [BasicAA] Use PhiValuesAnalysis if available when handling phi alias
By using PhiValuesAnalysis we can get all the values reachable from a phi, so
we can be more precise instead of giving up when a phi has phi operands. We
can't make BaseicAA directly use PhiValuesAnalysis though, as the user of
BasicAA may modify the function in ways that PhiValuesAnalysis can't cope with.

For this optional usage to work correctly BasicAAWrapperPass now needs to be not
marked as CFG-only (i.e. it is now invalidated even when CFG is preserved) due
to how the legacy pass manager handles dependent passes being invalidated,
namely the depending pass still has a pointer to the now-dead dependent pass.

Differential Revision: https://reviews.llvm.org/D44564

llvm-svn: 338242
2018-07-30 11:52:08 +00:00
Alexandros Lamprineas de3ca964c1 [GVNHoist] Re-enable GVNHoist by default
My initial motivation for this came from https://reviews.llvm.org/D48122,
where it was pointed out that my change didn't fit well in SimplifyCFG and
therefore using GVNHoist was a better way to go. GVNHoist has been disabled
for a while as there was a list of bugs related to it.

I have fixed the following bugs:

https://bugs.llvm.org/show_bug.cgi?id=37808 -> https://reviews.llvm.org/D48372 (rL337149)
https://bugs.llvm.org/show_bug.cgi?id=36787 -> https://reviews.llvm.org/D49555 (rL337674)
https://bugs.llvm.org/show_bug.cgi?id=37445 -> https://reviews.llvm.org/D49425 (rL337680)

The next two bugs no longer occur, and it's unclear which commit fixed them:

https://bugs.llvm.org/show_bug.cgi?id=36635
https://bugs.llvm.org/show_bug.cgi?id=37791

I investigated this one and proved to be unrelated to GVNHoist, but a genuine bug in NewGvn:

https://bugs.llvm.org/show_bug.cgi?id=37660

To convince myself GVNHoist is in a good state I made a successful bootstrap build of LLVM.
Merging this change now in order to make it to the LLVM 7.0.0 branch.

Differential Revision: https://reviews.llvm.org/D49858

llvm-svn: 338240
2018-07-30 10:50:18 +00:00
Francis Visoiu Mistrih 7d003657de [MachineOutliner][X86] Use TAILJMPd64 instead of JMP_1 for TailCall construction
The machine verifier asserts with:

Assertion failed: (isMBB() && "Wrong MachineOperand accessor"), function getMBB, file ../include/llvm/CodeGen/MachineOperand.h, line 542.

It calls analyzeBranch which tries to call getMBB if the opcode is
JMP_1, but in this case we do:

JMP_1 @OUTLINED_FUNCTION

I believe we have to use TAILJMPd64 instead of JMP_1 since JMP_1 is used
with brtarget8.

Differential Revision: https://reviews.llvm.org/D49299

llvm-svn: 338237
2018-07-30 09:59:33 +00:00
Dean Michael Berris 927b3da6c9 Revert "[X86] Correct the immediate cost for 'add/sub i64 %x, 0x80000000'."
This reverts commit r338204.

llvm-svn: 338236
2018-07-30 09:45:09 +00:00
Nicolai Haehnle 7f0d05d532 AMDGPU: Force skip over s_sendmsg and exp instructions
Summary:
These instructions interact with hardware blocks outside the shader core,
and they can have "scalar" side effects even when EXEC = 0. We don't
want these scalar side effects to occur when all lanes want to skip
these instructions, so always add the execz skip branch instruction
for basic blocks that contain them.

Also ensure that we skip scalar stores / atomics, though we don't
code-gen those yet.

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D48431

Change-Id: Ieaeb58352e2789ffd64745603c14970c60819d44
llvm-svn: 338235
2018-07-30 09:23:59 +00:00
Petr Pavlu 8b6eff4e77 [ARM] Fix over-alignment in arguments that are HA of 128-bit vectors
Code in `CC_ARM_AAPCS_Custom_Aggregate()` is responsible for handling
homogeneous aggregates for `CC_ARM_AAPCS_VFP`. When an aggregate ends up
fully on stack, the function tries to pack all resulting items of the
aggregate as tightly as possible according to AAPCS.

Once the first item was laid out, the alignment used for consecutive
items was the size of one item. This logic went wrong for 128-bit
vectors because their alignment is normally only 64 bits, and so could
result in inserting unexpected padding between the first and second
element.

The patch fixes the problem by updating the alignment with the item size
only if this results in reducing it.

Differential Revision: https://reviews.llvm.org/D49720

llvm-svn: 338233
2018-07-30 08:49:30 +00:00
Karl-Johan Karlsson e5899447b4 [RegisterScavenger] Fix debug print
llvm-svn: 338231
2018-07-30 08:17:00 +00:00
Max Kazantsev 3327bcaeb1 [NFC] Prepare GuardWidening for widening of cond branches
llvm-svn: 338229
2018-07-30 07:07:32 +00:00
Zachary Turner a6869515cd Try to fix build.
llvm-svn: 338227
2018-07-30 03:25:27 +00:00
Zachary Turner 71c91f948a [MS Demangler] Demangle symbols in function scopes.
There are a couple of issues you run into when you start getting into
more complex names, especially with regards to function local statics.
When you've got something like:

    int x() {
      static int n = 0;
      return n;
    }

Then this needs to demangle to something like

    int `int __cdecl x()'::`1'::n

The nested mangled symbols (e.g. `int __cdecl x()` in the above
example) also share state with regards to back-referencing, so
we need to be able to re-use the demangler in the middle of
demangling a symbol while sharing back-ref state.

To make matters more complicated, there are a lot of ambiguities
when demangling a symbol's qualified name, because a function local
scope pattern (usually something like `?1??name?`) looks suspiciously
like many other possible things that can occur, such as `?1` meaning
the second back-ref and disambiguating these cases is rather
interesting.  The `?1?` in a local scope pattern is actually a special
case of the more general pattern of `? + <encoded number> + ?`, where
"encoded number" can itself have embedded `@` symbols, which is a
common delimeter in mangled names.  So we have to take care during the
disambiguation, which is the reason for the overly complicated
`isLocalScopePattern` function in this patch.

I've added some pretty obnoxious tests to exercise all of this, which
exposed several other problems related to back-referencing, so those
are fixed here as well. Finally, I've uncommented some tests that were
previously marked as `FIXME`, since now these work.

Differential Revision: https://reviews.llvm.org/D49965

llvm-svn: 338226
2018-07-30 03:12:34 +00:00
Craig Topper e978d2ee4a [DAGCombiner] Remove unnecessary calls to AddToWorklist.
The DAGCombiner has a mechanism for ensuring all nodes have been visited at least once. Every time a node is visited, it makes sure its operands have been in the worklist at least once. This ensures that when multiple nodes are created by a combine, only the last node needs to be returned. The earlier nodes can all be found Through this operand check. These means we don't need to explicitly add nodes to the worklist when a combine creates multiple nodes.

I've removed the most obvious cases here. There are probably more than can be removed.

llvm-svn: 338222
2018-07-29 18:39:26 +00:00
Sanjay Patel 577c705752 [InstCombine] try to fold 'add+sub' to 'not+add'
These are reassociated versions of the same pattern and
similar transforms as in rL338200 and rL338118.

The motivation is identical to those commits:
Patterns with add/sub combos can be improved using
'not' ops. This is better for analysis and may lead
to follow-on transforms because 'xor' and 'add' are
commutative/associative. It can also help codegen.

llvm-svn: 338221
2018-07-29 18:13:16 +00:00
Sanjay Patel 2daf28f9ce [InstCombine] add tests for another sub-not variant; NFC
llvm-svn: 338220
2018-07-29 18:07:28 +00:00
Zachary Turner 316109b5b8 [MS Demangler] NFC - Remove state from Demangler class.
We need to be able to initiate a nested demangling from inside
of an "outer" demangling.  These need to be able to share some
state, such as back-references.  As a result, we can't store
things like the output stream or the mangled name in the Demangler
class, since each demangling will have different values.  So
remove this state and pass it through the necessary methods.

llvm-svn: 338219
2018-07-29 16:38:02 +00:00
Sanjay Patel 54421ce918 [InstSimplify] fold funnel shifts with 0-shift amount
llvm-svn: 338218
2018-07-29 16:36:38 +00:00
Sanjay Patel 46af5835af [InstSimplify] add tests for funnel shift intrinsics; NFC
llvm-svn: 338217
2018-07-29 16:27:17 +00:00
Jonas Devlieghere ae1727e3dd [dsymutil] Simplify temporary file handling.
Dsymutil's update functionality was broken on Windows because we tried
to rename a file while we're holding open handles to that file. TempFile
provides a solution for this through its keep(Twine) method. This patch
changes dsymutil to make use of that functionality.

Differential revision: https://reviews.llvm.org/D49860

llvm-svn: 338216
2018-07-29 14:56:15 +00:00
Sanjay Patel f52eeb1123 [InstSimplify] refactor intrinsic simplifications; NFCI
llvm-svn: 338215
2018-07-29 14:42:08 +00:00
Sanjay Patel 7312206f2f revert r338206 because the test does not pass
Example of bot failure:
http://lab.llvm.org:8011/builders/clang-cmake-armv8-quick/builds/5107/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Ainline-asm-operand-implicit-cast.ll

llvm-svn: 338214
2018-07-29 14:30:49 +00:00
Dylan McKay 6bc5d5c6db [AVR] Re-enable expansion of ADDE/ADDC/SUBE/SUBC in ISel
This was disabled in r333748, which broke four tests.

In the future, these need to be updated to UADDO/ADDCARRY or
USUBO/SUBCARRY.

llvm-svn: 338212
2018-07-29 11:38:36 +00:00
Sander de Smalen ad88a99956 [AArch64][SVE] Asm: Support for WHILE(LE|LO|LS|LT) instructions.
The WHILE instructions generate a predicate that is true while the 
comparison of the first scalar operand (incremented for each predicate
element) with the second scalar operand is true and false thereafter.

  WHILELE  While incrementing signed scalar less than or equal to scalar
  WHILELO  While incrementing unsigned scalar lower than scalar
  WHILELS  While incrementing unsigned scalar lower than or same as scalar
  WHILELT  While incrementing signed scalar less than scalar

e.g.

  whilele  p0.s, x0, x1

  generates predicate p0 (for 32bit elements) by incrementing
  (signed) x0 and comparing that vector to splat(x1).

llvm-svn: 338211
2018-07-29 08:51:08 +00:00
Sander de Smalen e70ed3187c [AArch64][SVE] Asm: Instructions to perform serialized operations.
The instructions added in this patch permit active elements within
a vector to be processed sequentially without unpacking the vector.

  PFIRST      Set the first active element to true.
  PNEXT       Find next active element in predicate.
  CTERMEQ     Compare and terminate loop when equal.
  CTERMNE     Compare and terminate loop when not equal.

llvm-svn: 338210
2018-07-29 08:00:16 +00:00
Zachary Turner a7dffb139c [MS Demangler] Refactor some of the name parsing code.
There are some very subtle differences between how one should
parse symbol names and type names.  They differ with respect
to back-referencing, the set of legal values that can appear
as the unqualified portion, and various other aspects.

By separating the parsing code into separate paths, we can
remove a lot of ambiguity during the demangling process, which
is necessary for demangling more complicated things like
function local statics, nested classes, and lambdas.

llvm-svn: 338207
2018-07-28 22:10:42 +00:00
Thomas Preud'homme 74ffd14e15 Fix crash on inline asm with 64bit matching input in 32bit GPR
Add support for inline assembly with matching input operand that do not
naturally go in the register class it is constrained to (eg. double in a
32-bit GPR). Note that regular input is already handled by existing
code.

llvm-svn: 338206
2018-07-28 21:33:39 +00:00
Craig Topper 9db3573d3a [SelectionDAG] Pass std::vector by reference instead of by pointer to BuildSDIV/BuildUDIV.
This removes the need for an assert to ensure the pointer isn't null.

Years ago we had ifs the checked the pointer was non-null before very access to the vector. These checks were removed and replaced with a single assert. But a reference seems more suitable here.

llvm-svn: 338205
2018-07-28 19:44:20 +00:00
Craig Topper 5daa032546 [X86] Correct the immediate cost for 'add/sub i64 %x, 0x80000000'.
X86 normally requires immediates to be a signed 32-bit value which would exclude i64 0x80000000. But for add/sub we can negate the constant and use the opposite instruction.

llvm-svn: 338204
2018-07-28 18:21:46 +00:00
Craig Topper ba208b07b6 [X86] Use alignTo and divideCeil to make some code more readable. NFC
llvm-svn: 338203
2018-07-28 18:21:45 +00:00
Zachary Turner fea073f4e6 Add VS natvis support for LLVMDemangle's StringView.
llvm-svn: 338202
2018-07-28 17:25:42 +00:00
David Bolvansky f801a58c0d [InstCombine] Tests for fold Select with binary op
Differential Revision: https://reviews.llvm.org/D49961

llvm-svn: 338201
2018-07-28 17:13:33 +00:00
Sanjay Patel 818b253d3a [InstCombine] try to fold 'sub' to 'not'
https://rise4fun.com/Alive/jDd

Patterns with add/sub combos can be improved using
'not' ops. This is better for analysis and may lead
to follow-on transforms because 'xor' and 'add' are 
commutative/associative. It can also help codegen.  

llvm-svn: 338200
2018-07-28 16:48:44 +00:00
Sander de Smalen 5b3a289424 [AArch64][SVE] Asm: Support for PFALSE and PTEST instructions.
This patch adds PFALSE (unconditionally sets all elements of
the predicate to false) and PTEST (set the status flags for the
predicate).

llvm-svn: 338198
2018-07-28 14:18:11 +00:00
Matt Arsenault 8f9dde94b7 AMDGPU: Stop wasting argument registers with v3i32/v3f32
SelectionDAGBuilder widens v3i32/v3f32 arguments to
to v4i32/v4f32 which consume an additional register.
In addition to wasting argument space, this produces extra
instructions since now it appears the 4th vector component has
a meaningful value to most combines.

llvm-svn: 338197
2018-07-28 14:11:34 +00:00
Sander de Smalen 3878bf83dd [AArch64][SVE] Asm: Data-dependent loop predicate partitioning instructions.
This patch adds support for instructions that partition a predicate
based on data-dependent termination conditions in a loop.

  BRKA      Break after the first true condition
  BRKAS     Break after the first true condition, setting condition flags
  BRKB      Break before the first true condition
  BRKBS     Break before the first true condition, setting condition flags

  BRKPA     Break after the first true condition, propagating from the 
            previous partition
  BRKPAS    Break after the first true condition, propagating from the 
            previous partition, setting condition flags
  BRKPB     Break before the first true condition, propagating from the 
            previous partition
  BRKPBS    Break before the first true condition, propagating from the 
            previous partition, setting condition flags

  BRKN      Propagate break to next partition
  BKRNS     Propagate break to next partition, setting condition flags

llvm-svn: 338196
2018-07-28 14:04:52 +00:00
David Bolvansky 9800b710c2 [InstSimplify] Moved Select + AND/OR tests from InstCombine
Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49957

llvm-svn: 338195
2018-07-28 13:52:45 +00:00
Matt Arsenault 81920b0a25 DAG: Add calling convention argument to calling convention funcs
This seems like a pretty glaring omission, and AMDGPU
wants to treat kernels differently from other calling
conventions.

llvm-svn: 338194
2018-07-28 13:25:19 +00:00
Matt Arsenault 72b0e38b26 AMDGPU: Stop trying to extend arguments for clover
This was trying to replace i8/i16 arguments with i32, which
was broken and no longer necessary.

llvm-svn: 338193
2018-07-28 12:34:25 +00:00
David Green fc4b0fe0a2 [GlobalOpt] Test array indices inside structs for out-of-bounds accesses
We now, from clang, can turn arrays of
  static short g_data[] = {16, 16, 16, 16, 16, 16, 16, 16, 0, 0, 0, 0, 0, 0, 0, 0};
into structs of the form
  @g_data = internal global <{ [8 x i16], [8 x i16] }> ...

GlobalOpt will incorrectly SROA it, not realising that the access to the first
element may overflow into the second. This fixes it by checking geps more
thoroughly.

I believe this makes the globalsra-partial.ll test case invalid as the %i value
could be out of bounds. I've re-purposed it as a negative test for this case.

Differential Revision: https://reviews.llvm.org/D49816

llvm-svn: 338192
2018-07-28 08:20:10 +00:00
David Bolvansky f947608ddf [InstCombine] Fold Select with AND/OR condition
Summary:
Fold
```
%A = icmp ne i8 %X, %V1
%B = icmp ne i8 %X, %V2
%C = or i1 %A, %B
%D = select i1 %C, i8 %X, i8 %V1
ret i8 %D
  =>
ret i8 %X

Fixes https://bugs.llvm.org/show_bug.cgi?id=38334
Proof: https://rise4fun.com/Alive/plI8

Reviewers: spatel, lebedev.ri

Reviewed By: lebedev.ri

Subscribers: craig.topper, llvm-commits

Differential Revision: https://reviews.llvm.org/D49919

llvm-svn: 338191
2018-07-28 06:55:51 +00:00
Erik Pilkington 256db4b799 [demangler] Fix an oss-fuzz bug from r338138
Stack overflow on invalid. While collapsing references, we were skipping over a
cycle check in ForwardTemplateReference leading to a stack overflow. This commit
fixes the problem by duplicating the cycle check in ReferenceType.

llvm-svn: 338190
2018-07-28 04:06:30 +00:00
Jakub Kuderski 81db59b6f1 [Dominators] Make applyUpdate's documentation less confusing [NFC]
Summary:
It was pointed out by @chandlerc that it's not clear whether both applyUpdates and insert/deleteEdge can be used to perform multiple updates.

IMO, the confusing part was that the comment above applyUpdates made a comparison of expected update time between calling it and calling insert/deleteEdge multiple times. It's generally not possible to safely call insert/deleteEdge multiple times, which documentation for each of the 3 functions warns about, so the whole comparison makes very little sense. On top of that, the comment is already lengthy, so I think it's best to just get rid of this comparison.

Reviewers: chandlerc, asbirlea, NutshellySima, grosser

Reviewed By: chandlerc

Subscribers: llvm-commits, chandlerc

Differential Revision: https://reviews.llvm.org/D49944

llvm-svn: 338184
2018-07-28 00:54:07 +00:00
Vedant Kumar 8a05b01d13 [docs] Clarify role of DIExpressions within debug intrinsics
This should make the semantics of DIExpressions within llvm.dbg.{addr,
declare, value} easier to understand.

Differential Revision: https://reviews.llvm.org/D49572

llvm-svn: 338182
2018-07-28 00:33:47 +00:00
Craig Topper 50b1d4303d [DAGCombiner] Teach DAG combiner that A-(B-C) can be folded to A+(C-B)
This can be useful since addition is commutable, and subtraction is not.

This matches a transform that is also done by InstCombine.

llvm-svn: 338181
2018-07-28 00:27:25 +00:00
Alina Sbirlea 5666c7e4bd [SimpleLoopUnswitch] Fix DT updates for trivial branch unswitching.
Summary:
Fixing 2 issues with the DT update in trivial branch switching, though I don't have a case where DT update fails.
1. After splitting ParentBB->UnswitchedBB edge, new edges become: ParentBB->LoopExitBB->UnswitchedBB, so remove ParentBB->LoopExitBB edge.
2. AFAIU, for multiple CFG changes, DT should be updated using batch updates, vs consecutive addEdge and removeEdge calls.

Reviewers: chandlerc, kuhar

Subscribers: sanjoy, jlebar, llvm-commits

Differential Revision: https://reviews.llvm.org/D49925

llvm-svn: 338180
2018-07-28 00:01:05 +00:00
Wouter van Oortmerssen a90d24da1c Revert "[WebAssembly] Added default stack-only instruction mode for MC."
This reverts commit d3c9af4179eae7793d1487d652e2d4e23844555f.
(SVN revision 338164)

llvm-svn: 338176
2018-07-27 23:19:51 +00:00
Fangrui Song c39ea92359 [Support] Remove unnecessary MemoryBuffer::anchor (where the destructor serves as the key function)
llvm-svn: 338175
2018-07-27 23:12:11 +00:00
Craig Topper c3e11bf3f7 [X86] Add support expanding multiplies by constant where the constant is -3/-5/-9 multplied by a power of 2.
These can be replaced with an LEA, a shift, and a negate. This seems to match what gcc and icc would do.

llvm-svn: 338174
2018-07-27 23:04:59 +00:00
Fangrui Song fdfe2a9236 [llvm-objcopy] Make --strip-debug strip .zdebug* (zlib-gnu) sections
This behavior matches GNU objcopy.

llvm-svn: 338173
2018-07-27 22:51:36 +00:00
Reid Kleckner ba82788ff6 [InstrProf] Don't register __llvm_profile_runtime_user
Refactor some FileCheck prefixes while I'm at it.

Fixes PR38340

llvm-svn: 338172
2018-07-27 22:21:35 +00:00
Wouter van Oortmerssen a67c4137c3 [WebAssembly] Added default stack-only instruction mode for MC.
Summary:
Moved Explicit Locals pass to last.
Made that pass obligatory.
Made it convert from register to stack based instructions, and removed the registers.
Fixes to related code that was expecting register based instructions.
Added the correct testing flag to all tests, depending on what the
format they were expecting so far.
Translated one test to stack format as example: reg-stackify-stack.ll

tested:
llvm-lit -v `find test -name WebAssembly`
unittests/MC/*

Reviewers: dschuff, sunfish

Subscribers: sbc100, jgravelle-google, eraman, aheejin, llvm-commits

Differential Revision: https://reviews.llvm.org/D49160

llvm-svn: 338164
2018-07-27 20:56:43 +00:00
David Bolvansky 173484d78c [InstCombine] [NFC] [Tests] Fold Select with AND/OR condition - fixed
Differential Revision: https://reviews.llvm.org/D49933

llvm-svn: 338161
2018-07-27 20:29:32 +00:00
Jessica Paquette f90edbe3d6 Recommit "Enable MachineOutliner by default under -Oz for AArch64"
Fixed the ASAN failure from before in r338148, so recommiting.

This patch enables the MachineOutliner by default in AArch64 under -Oz.

The MachineOutliner offers around a 4.5% improvement on the current -Oz code
size improvements.

We have done work into improving the debuggability of outlined code, so that
users of -Oz won't be surprised by the optimization. We have also been executing
the LLVM test suite and common external tests such as the SPEC suites
continuously with no issue. The outliner has a low compile-time overhead of
roughly 1%. At this point, the outliner would be a really good addition to the
-Oz pass pipeline!

llvm-svn: 338160
2018-07-27 20:18:27 +00:00
David Bolvansky 1b82617473 [InstCombine] [NFC] [Tests] Fold Select with AND/OR condition
Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49932

llvm-svn: 338159
2018-07-27 20:18:12 +00:00
Evandro Menezes f611ce82c0 [SLC] Test simplification of pow(x, 0.333...) to cbrt(x) (NFC)
Add test case for simplifying `pow(x, 0.333...)` into `cbrt(x)`, which
D49040 enables.

llvm-svn: 338152
2018-07-27 18:56:47 +00:00
Sanjay Patel 06c7d5aef6 [AArch64, PowerPC, x86] add more signbit math tests; NFC
The tests with a constant sub operand were added with rL338143,
but the potential transform doesn't have that requirement, so
adding more tests with variable operands.

llvm-svn: 338150
2018-07-27 18:31:21 +00:00
Jessica Paquette 9d93c6026a [MachineOutliner] Exit getOutliningCandidateInfo when we erase all candidates
There was a missing check for if a candidate list was entirely deleted. This
adds that check.

This fixes an asan failure caused by running test/CodeGen/AArch64/addsub_ext.ll
with the MachineOutliner enabled.

llvm-svn: 338148
2018-07-27 18:21:57 +00:00
Evandro Menezes fcca45f0dd [ARM] Add new target feature to fuse literal generation
This feature enables the fusion of such operations on Cortex A57 and Cortex
A72, as recommended in their Software Optimisation Guides, sections 4.14 and
4.11, respectively.

Differential revision: https://reviews.llvm.org/D49563

llvm-svn: 338147
2018-07-27 18:16:47 +00:00
Sanjay Patel efac39eef6 [AArch64, PowerPC, x86] add more signbit math tests; NFC
llvm-svn: 338143
2018-07-27 18:12:29 +00:00
Erik Pilkington 3a6fed4a7b [demangler] Support for reference collapsing
llvm.org/PR38323

llvm-svn: 338138
2018-07-27 17:27:40 +00:00
Jessica Paquette faea2d3130 Revert "Enable MachineOutliner by default under -Oz for AArch64"
It failed an Asan test on a bot:

http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/21543/steps/check-llvm%20asan/logs/stdio

Fixing that before recommitting.

llvm-svn: 338136
2018-07-27 17:25:38 +00:00
Yonghong Song 04ccfda075 bpf: add missing RegState to notify MachineInstr verifier necessary register usage
Errors like the following are reported by:

  https://urldefense.proofpoint.com/v2/url?u=http-3A__lab.llvm.org-3A8011_builders_llvm-2Dclang-2Dx86-5F64-2Dexpensive-2Dchecks-2Dwin_builds_11261&d=DwIBAg&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=929oWPCf7Bf2qQnir4GBtowB8ZAlIRWsAdTfRkDaK-g&s=9k-wbEUVpUm474hhzsmAO29VXVvbxJPWD9RTgCD71fQ&e=

  *** Bad machine code: Explicit definition marked as use ***
  - function:    cal_align1
  - basic block: %bb.0 entry (0x47edd98)
  - instruction: LDB $r3, $r2, 0
  - operand 0:   $r3

This is because RegState info was missing for ScratchReg inside
expandMEMCPY. This caused incomplete register usage information to
MachineInstr verifier which then would complain as there could be potential
code-gen issue if the complained MachineInstr is used in place where
register usage information matters even though the memcpy expanding is not
in such case as it happens at the last stage of IR optimization pipeline.

We should always specify those register usage information which compiler
couldn't deduct automatically whenever we add a hardware register manually.

Reported-by: Builder llvm-clang-x86_64-expensive-checks-win Build #11261
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 338134
2018-07-27 16:58:52 +00:00
Jessica Paquette d4229b985c Enable MachineOutliner by default under -Oz for AArch64
This patch enables the MachineOutliner by default in AArch64 under -Oz.

The MachineOutliner offers around a 4.5% improvement on the current -Oz code
size improvements.

We have done work into improving the debuggability of outlined code, so that
users of -Oz won't be surprised by the optimization. We have also been executing
the LLVM test suite and common external tests such as the SPEC suites
continuously with no issue. The outliner has a low compile-time overhead of
roughly 1%. At this point, the outliner would be a really good addition to the
-Oz pass pipeline!

llvm-svn: 338133
2018-07-27 16:44:42 +00:00
Sanjay Patel c7abb416dc [DAGCombiner] fold 'not' with signbit math
This is a follow-up suggested in D48970. 

Alive proofs:
https://rise4fun.com/Alive/sII

We can eliminate an instruction in the usual select-of-constants 
to bit hack transform by adjusting the add/sub with constant.
This is always a win. 

There are more transforms that are likely wins, but they may need 
target hooks in case some targets do not benefit. 

This is another step towards making up for canonicalizing to 
select-of-constants in rL331486.

llvm-svn: 338132
2018-07-27 16:42:55 +00:00
Sanjay Patel 1812d33e22 [x86] add more tests for signbit math; NFC
llvm-svn: 338131
2018-07-27 16:22:40 +00:00
Sanjay Patel 60c04b961e [PowerPC] add more tests for signbit math; NFC
llvm-svn: 338130
2018-07-27 16:22:18 +00:00
Sanjay Patel f815bc658b [AArch64] add more tests for signbit math; NFC
llvm-svn: 338129
2018-07-27 16:21:56 +00:00
Fangrui Song 9c85d7acbe [Support] Use unsigned char for xxHash 64-bit
Before, the last 3 bytes were char-signedness dependent.

llvm-svn: 338128
2018-07-27 16:01:09 +00:00