Commit Graph

117466 Commits

Author SHA1 Message Date
Zachary Turner 5bba1cafbe Better support for POSIX paths in PDBs.
This a resubmission of a patch which was previously reverted
due to breaking several lld tests.  The issues causing those
failures have been fixed, so the patch is now resubmitted.

---Original Commit Message---

While it doesn't make a *ton* of sense for POSIX paths to be
in PDBs, it's possible to occur in real scenarios involving
cross compilation.

The tools need to be able to handle this, because certain types
of debugging scenarios are possible without a running process
and so don't necessarily require you to be on a Windows system.
These include post-mortem debugging and binary forensics (e.g.
using a debugger to disassemble functions and examine symbols
without running the process).

There's changes in clang, LLD, and lldb in this patch.  After
this the cross-platform disassembly and source-list tests pass
on Linux.

Furthermore, the behavior of LLD can now be summarized by a much
simpler rule than before: Unless you specify /pdbsourcepath and
/pdbaltpath, the PDB ends up with paths that are valid within
the context of the machine that the link is performed on.

Differential Revision: https://reviews.llvm.org/D53149

llvm-svn: 344377
2018-10-12 17:26:19 +00:00
Fangrui Song 2c31e12d62 [BPF] Some fixes after rL344366
* Move #include outside of namespaces
* Add missing #include
* Add out-of-line virtual destructor to BTFTypeEntry

designated initializers should also be fixed

llvm-svn: 344376
2018-10-12 17:23:25 +00:00
Nick Desaulniers 727891c918 [Support] exit with custom return code for SIGPIPE
Summary:
We tell the user to file a bug report on LLVM right now, and
SIGPIPE isn't LLVM's fault so our error message is wrong.

Allows frontends to detect SIGPIPE from writing to closed readers.
This can be seen commonly from piping into head, tee, or split.

Fixes PR25349, rdar://problem/14285346, b/77310947

Reviewers: jfb

Reviewed By: jfb

Subscribers: majnemer, kristina, llvm-commits, thakis, srhines

Differential Revision: https://reviews.llvm.org/D53000

llvm-svn: 344372
2018-10-12 17:22:07 +00:00
Yonghong Song 6c2327a09e [BPF] Add BTF generation for BPF target
BTF is the debug format for BPF, a kernel virtual machine
and widely used for tracing, networking and security, etc ([1]).

Currently only instruction streams are passed to kernel,
the kernel verifier verifies them before execution. In order to
provide better visibility of bpf programs to user space
tools, some debug information, e.g., function names and
debug line information are desirable for kernel so tools
can get such information with better annotation
for jited instructions for performance or other reasons.

The dwarf is too complicated in kernel and for BPF.
Hence, BTF is designed to be the debug format for BPF ([2]).
Right now, pahole supports BTF for types, which
are generated based on dwarf sections in the ELF file.

In order to annotate performance metrics for jited bpf insns,
it is necessary to pass debug line info to the kernel.
Furthermore, we want to pass the actual code to the
kernel because of the following reasons:

. bpf program typically is small so storage overhead
  should be small.
. in bpf land, it is totally possible that
  an application loads the bpf program into the
  kernel and then that application quits, so
  holding debug info by the user space application
  is not practical.
. having source codes directly kept by kernel
  would ease deployment since the original source
  code does not need ship on every hosts and
  kernel-devel package does not need to be
  deployed even if kernel headers are used.

The only reliable time to get the source code is
during compilation time. This will result in both more
accurate information and easier deployment as
stated in the above.

Another consideration is for JIT. The project like bcc
use MCJIT to compile a C program into bpf insns and
load them to the kernel ([3]). The generated BTF sections
will be readily available for such cases as well.

This patch implemented generation of BTF info in llvm
compiler. The BTF related sections will be generated
when both -target bpf and -g are specified. Two sections
are generated:
  .BTF contains all the type and string information, and
  .BTF.ext contains the func_info and line_info.

The separation is related to how two sections are used
differently in bpf loader, e.g., linux libbpf ([4]).
The .BTF section can be loaded into the kernel directly
while .BTF.ext needs loader manipulation before loading
to the kernel. The format of the each section is roughly
defined in llvm:include/llvm/MC/MCBTFContext.h and
from the implementation in llvm:lib/MC/MCBTFContext.cpp.
A later example also shows the contents in each section.

The type and func_info are gathered during CodeGen/AsmPrinter
by traversing dwarf debug_info. The line_info is
gathered in MCObjectStreamer before writing to
the object file. After all the information is gathered,
the two sections are emitted in MCObjectStreamer::finishImpl.

With cmake CMAKE_BUILD_TYPE=Debug, the compiler can
dump out all the tables except insn offset, which
will be resolved later as relocation records.
The debug type "btf" is used for BTFContext dump.

Dwarf tests the debug info generation with
llvm-dwarfdump to decode the binary sections and
check whether the result is expected. Currently
we do not have such a tool yet. We will implement
btf dump functionality in bpftool ([5]) as the bpftool is
considered the recommended tool for bpf introspection.
The implementation for type and func_info is tested
with linux kernel test cases. The line_info is visually
checked with dump from linux kernel libbpf ([4]) and
checked with readelf dumping section raw data.

Note that the .BTF and .BTF.ext information will not
be emitted to assembly code and there is no assembler
support for BTF either.

In the below, with a clang/llvm built with CMAKE_BUILD_TYPE=Debug,
Each table contents are shown for a simple C program.

  -bash-4.2$ cat -n test.c
     1  struct A {
     2    int a;
     3    char b;
     4  };
     5
     6  int test(struct A *t) {
     7    return t->a;
     8  }
  -bash-4.2$ clang -O2 -target bpf -g -mllvm -debug-only=btf -c test.c
  Type Table:
  [1] FUNC name_off=1 info=0x0c000001 size/type=2
        param_type=3
  [2] INT name_off=12 info=0x01000000 size/type=4
        desc=0x01000020
  [3] PTR name_off=0 info=0x02000000 size/type=4
  [4] STRUCT name_off=16 info=0x04000002 size/type=8
        name_off=18 type=2 bit_offset=0
        name_off=20 type=5 bit_offset=32
  [5] INT name_off=22 info=0x01000000 size/type=1
        desc=0x02000008

  String Table:
  0 :
  1 : test
  6 : .text
  12 : int
  16 : A
  18 : a
  20 : b
  22 : char
  27 : test.c
  34 : int test(struct A *t) {
  58 :   return t->a;

  FuncInfo Table:
  sec_name_off=6
        insn_offset=<Omitted> type_id=1

  LineInfo Table:
  sec_name_off=6
        insn_offset=<Omitted> file_name_off=27 line_off=34 line_num=6 column_num=0
        insn_offset=<Omitted> file_name_off=27 line_off=58 line_num=7 column_num=3
  -bash-4.2$ readelf -S test.o
  ......
    [12] .BTF              PROGBITS         0000000000000000  0000028d
       00000000000000c1  0000000000000000           0     0     1
    [13] .BTF.ext          PROGBITS         0000000000000000  0000034e
       0000000000000050  0000000000000000           0     0     1
    [14] .rel.BTF.ext      REL              0000000000000000  00000648
       0000000000000030  0000000000000010          16    13     8
  ......
  -bash-4.2$

The latest linux kernel ([6]) can already support .BTF with type information.
The [7] has the reference implementation in linux kernel side
to support .BTF.ext func_info. The .BTF.ext line_info support is not
implemented yet. If you have difficulty accessing [6], you can
manually do the following to access the code:

  git clone https://github.com/yonghong-song/bpf-next-linux.git
  cd bpf-next-linux
  git checkout btf

The change will push to linux kernel soon once this patch is landed.

References:
[1]. https://www.kernel.org/doc/Documentation/networking/filter.txt
[2]. https://lwn.net/Articles/750695/
[3]. https://github.com/iovisor/bcc
[4]. https://github.com/torvalds/linux/tree/master/tools/lib/bpf
[5]. https://github.com/torvalds/linux/tree/master/tools/bpf/bpftool
[6]. https://github.com/torvalds/linux
[7]. https://github.com/yonghong-song/bpf-next-linux/tree/btf

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>

Differential Revision: https://reviews.llvm.org/D52950

llvm-svn: 344366
2018-10-12 17:01:46 +00:00
Sanjay Patel e28c8ecd72 [x86] add and use fast horizontal vector math subtarget feature
This is the planned follow-up to D52997. Here we are reducing horizontal vector math codegen 
by default. AMD Jaguar (btver2) should have no difference with this patch because it has 
fast-hops. (If we want to set that bit for other CPUs, let me know.)

The code changes are small, but there are many test diffs. For files that are specifically 
testing for hops, I added RUNs to distinguish fast/slow, so we can see the consequences 
side-by-side. For files that are primarily concerned with codegen other than hops, I just 
updated the CHECK lines to reflect the new default codegen.

To recap the recent horizontal op story:

1. Before rL343727, we were producing hops for all subtargets for a variety of patterns. 
   Hops were likely not optimal for all targets though.
2. The IR improvement in r343727 exposed a hole in the backend hop pattern matching, so 
   we reduced hop codegen for all subtargets. That was bad for Jaguar (PR39195).
3. We restored the hop codegen for all targets with rL344141. Good for Jaguar, but 
   probably bad for other CPUs.
4. This patch allows us to distinguish when we want to produce hops, so everyone can be 
   happy. I'm not sure if we have the best predicate here, but the intent is to undo the 
   extra hop-iness that was enabled by r344141.

Differential Revision: https://reviews.llvm.org/D53095

llvm-svn: 344361
2018-10-12 16:41:02 +00:00
Nick Desaulniers 47bab69a2e [MC][ELF] fix newly added test
Summary:
Reland of
- r344197 "[MC][ELF] compute entity size for explicit sections"
- r344206 "[MC][ELF] Fix section_mergeable_size.ll"
after being reverted in r344278 due to build breakages from not
specifying a target triple.

Move test from test/CodeGen/Generic/ to test/MC/ELF/.
Add explicit target triple so we don't try to run
this test on non ELF targets.

Reported: https://reviews.llvm.org/D53056#1261707

Reviewers: fhahn, rnk, espindola, NoQ

Reviewed By: fhahn, rnk

Subscribers: NoQ, MaskRay, rengolin, emaste, arichardson, llvm-commits, pirama, srhines

Differential Revision: https://reviews.llvm.org/D53146

llvm-svn: 344360
2018-10-12 16:35:44 +00:00
Simon Pilgrim c046b6856e Pull out repeated value types. NFCI.
llvm-svn: 344355
2018-10-12 15:49:19 +00:00
Simon Pilgrim b926fd7b36 Pull out repeated value types. NFCI.
llvm-svn: 344354
2018-10-12 15:48:47 +00:00
Eric Liu 55ab86b72b Fix unused variable warning after r344348
llvm-svn: 344350
2018-10-12 15:01:11 +00:00
Simon Pilgrim b8339c0167 [SelectionDAG] Move VectorLegalizer::ExpandCTLZ codegen into SelectionDAGLegalize
Generalize SelectionDAGLegalize's CTLZ expansion to handle vectors - lets VectorLegalizer::ExpandCTLZ to just pass the expansion on instead of repeating the same codegen.

llvm-svn: 344349
2018-10-12 14:45:57 +00:00
Simon Pilgrim 78b5a3c3ef [X86][SSE] LowerVectorCTPOP - pull out repeated byte sum stage.
Pull out repeated byte sum stage for popcount of vector elements > 8bits.

This allows us to simplify the LUT/BITMATH popcnt code to always assume vXi8 vectors, and also improves avx512bitalg codegen which only has access to vpopcntb/vpopcntw.

llvm-svn: 344348
2018-10-12 14:18:47 +00:00
Hiroshi Inoue 9552dd187a [PowerPC] avoid masking already-zero bits in BitPermutationSelector
The current BitPermutationSelector generates a code to build a value by tracking two types of bits: ConstZero and Variable.
ConstZero means a bit we need to mask off and Variable is a bit we copy from an input value.

This patch add third type of bits VariableKnownToBeZero caused by AssertZext node or zero-extending load node.
VariableKnownToBeZero means a bit comes from an input value, but it is known to be already zero. So we do not need to mask them.
VariableKnownToBeZero enhances flexibility to group bits, since we can avoid redundant masking for these bits.

This patch also renames "HasZero" to "NeedMask" since now we may skip masking even when we have zeros (of type VariableKnownToBeZero).

Differential Revision: https://reviews.llvm.org/D48025

llvm-svn: 344347
2018-10-12 14:02:20 +00:00
Max Moroz 4d010ca35b [SanitizerCoverage] Make Inline8bit and TracePC counters dead stripping resistant.
Summary:
Otherwise, at least on Mac, the linker eliminates unused symbols which
causes libFuzzer to error out due to a mismatch of the sizes of coverage tables.

Issue in Chromium: https://bugs.chromium.org/p/chromium/issues/detail?id=892167

Reviewers: morehouse, kcc, george.karpenkov

Reviewed By: morehouse

Subscribers: kubamracek, llvm-commits

Differential Revision: https://reviews.llvm.org/D53113

llvm-svn: 344345
2018-10-12 13:59:31 +00:00
Simon Pilgrim 29279f29c8 [X86][SSE] Add extract_subvector(PSHUFB) -> PSHUFB(extract_subvector()) combine
Fixes PR32160 by reducing the size of PSHUFB if we only use one of the lanes.

This approach can probably be generalized to handle any target shuffle (and any subvector index) but we have no test coverage at the moment.

llvm-svn: 344336
2018-10-12 12:10:34 +00:00
Andrea Di Biagio 6eebbe0a97 [tblgen][llvm-mca] Add the ability to describe move elimination candidates via tablegen.
This patch adds the ability to identify instructions that are "move elimination
candidates". It also allows scheduling models to describe processor register
files that allow move elimination.

A move elimination candidate is an instruction that can be eliminated at
register renaming stage.
Each subtarget can specify which instructions are move elimination candidates
with the help of tablegen class "IsOptimizableRegisterMove" (see
llvm/Target/TargetInstrPredicate.td).

For example, on X86, BtVer2 allows both GPR and MMX/SSE moves to be eliminated.
The definition of 'IsOptimizableRegisterMove' for BtVer2 looks like this:

```
def : IsOptimizableRegisterMove<[
  InstructionEquivalenceClass<[
    // GPR variants.
    MOV32rr, MOV64rr,

    // MMX variants.
    MMX_MOVQ64rr,

    // SSE variants.
    MOVAPSrr, MOVUPSrr,
    MOVAPDrr, MOVUPDrr,
    MOVDQArr, MOVDQUrr,

    // AVX variants.
    VMOVAPSrr, VMOVUPSrr,
    VMOVAPDrr, VMOVUPDrr,
    VMOVDQArr, VMOVDQUrr
  ], CheckNot<CheckSameRegOperand<0, 1>> >
]>;
```

Definitions of IsOptimizableRegisterMove from processor models of a same
Target are processed by the SubtargetEmitter to auto-generate a target-specific
override for each of the following predicate methods:

```
bool TargetSubtargetInfo::isOptimizableRegisterMove(const MachineInstr *MI)
const;
bool MCInstrAnalysis::isOptimizableRegisterMove(const MCInst &MI, unsigned
CPUID) const;
```

By default, those methods return false (i.e. conservatively assume that there
are no move elimination candidates).

Tablegen class RegisterFile has been extended with the following information:
 - The set of register classes that allow move elimination.
 - Maxium number of moves that can be eliminated every cycle.
 - Whether move elimination is restricted to moves from registers that are
   known to be zero.

This patch is structured in three part:

A first part (which is mostly boilerplate) adds the new
'isOptimizableRegisterMove' target hooks, and extends existing register file
descriptors in MC by introducing new fields to describe properties related to
move elimination.

A second part, uses the new tablegen constructs to describe move elimination in
the BtVer2 scheduling model.

A third part, teaches llm-mca how to query the new 'isOptimizableRegisterMove'
hook to mark instructions that are candidates for move elimination. It also
teaches class RegisterFile how to describe constraints on move elimination at
PRF granularity.

llvm-mca tests for btver2 show differences before/after this patch.

Differential Revision: https://reviews.llvm.org/D53134

llvm-svn: 344334
2018-10-12 11:23:04 +00:00
Simon Pilgrim c844bc84dd [X86] Ignore float/double non-temporal loads (PR39256)
Scalar non-temporal loads were asserting instead of just being ignored.

Reduced from https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=10895

llvm-svn: 344331
2018-10-12 10:20:16 +00:00
Tim Northover 487780678f SCCP: avoid caching DenseMap entry that might be invalidated.
Later calls to getValueState might insert entries into the ValueState map and
cause reallocation, invalidating a reference.

llvm-svn: 344327
2018-10-12 09:01:59 +00:00
Stefan Maksimovic 285c0f4fdc [mips] Mark fmaxl as a long double emulation routine
Failure was discovered upon running
projects/compiler-rt/test/builtins/Unit/divtc3_test.c
in a stage2 compiler build.

When compiling projects/compiler-rt/lib/builtins/divtc3.c,
a call to fmaxl within the divtc3 implementation had its
return values read from registers $2 and $3 instead of $f0 and $f2.
Include fmaxl in the list of long double emulation routines
to have its return value correctly interpreted as f128.

Almost exact issue here: https://reviews.llvm.org/D17760

Differential Revision: https://reviews.llvm.org/D52649

llvm-svn: 344326
2018-10-12 08:18:38 +00:00
Eugene Leviant eddf6b5df5 [ThinLTO] Don't import GV which contains blockaddress
Differential revision: https://reviews.llvm.org/D53139

llvm-svn: 344325
2018-10-12 07:24:02 +00:00
Sanjay Patel 56b6660d2e [DAGCombiner] rearrange extract_element+bitcast fold; NFC
I want to add another pattern here that includes scalar_to_vector,
so this makes that patch smaller. I was hoping to remove the
hasOneUse() check because it shouldn't be necessary for common
codegen, but an AMDGPU test has a comment suggesting that the
extra check makes things better on one of those targets.

llvm-svn: 344320
2018-10-11 23:56:56 +00:00
Matthias Braun c7efb6f990 Revert "DwarfDebug: Pick next location in case of missing location at block begin"
It originally triggered a stepping problem in the debugger, which could
be fixed by adjusting CodeGen/LexicalScopes.cpp however it seems we prefer
the previous behavior anyway.

See the discussion for details: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181008/593833.html

This reverts commit r343880.
This reverts commit r343874.

llvm-svn: 344318
2018-10-11 23:37:58 +00:00
Tom Stellard a894043910 Revert "AMDGPU/GlobalISel: Implement select for G_INSERT"
This reverts commit r344310.

The test case was failing on some bots.

llvm-svn: 344317
2018-10-11 23:36:46 +00:00
Matthias Braun d6131c9633 X86/TargetTransformInfo: Report div/rem constant immediate costs as TCC_Free
DIV/REM by constants should always be expanded into mul/shift/etc.
patterns. Unfortunately the ConstantHoisting pass runs too early at a
point where the pattern isn't expanded yet. However after
ConstantHoisting hoisted some immediate the result may not expand
anymore. Also the hoisting typically doesn't make sense because it
operates on immediates that will change completely during the expansion.

Report DIV/REM as TCC_Free so ConstantHoisting will not touch them.

Differential Revision: https://reviews.llvm.org/D53174

llvm-svn: 344315
2018-10-11 23:14:35 +00:00
Kostya Serebryany d891ac9794 merge two near-identical functions createPrivateGlobalForString into one
Summary:
We have two copies of createPrivateGlobalForString (in asan and in esan).
This change merges them into one. NFC

Reviewers: vitalybuka

Reviewed By: vitalybuka

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D53178

llvm-svn: 344314
2018-10-11 23:03:27 +00:00
Tom Stellard 4733be6e7b AMDGPU/GlobalISel: Implement select for G_INSERT
Reviewers: arsenm

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D53116

llvm-svn: 344310
2018-10-11 22:49:54 +00:00
Ana Pazos 0a5fcefa31 [RISCV] Fix disassembling of fence instruction with invalid field
Summary:
Instruction with 0 in fence field being disassembled as fence , iorw.
Printing "unknown" to match GAS behavior.

This bug was uncovered by a LLVM MC Disassembler Protocol Buffer Fuzzer
for the RISC-V assembly language.

Reviewers: asb

Subscribers: rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, jfb, PkmX, jocewei, asb

Differential Revision: https://reviews.llvm.org/D51828

llvm-svn: 344309
2018-10-11 22:49:13 +00:00
Richard Trieu dfd1760b5f Inline variable into assert to avoid unused variable warning.
llvm-svn: 344308
2018-10-11 22:42:41 +00:00
Craig Topper 35d513c7e4 [X86] Type legalize v2f32 loads by using an f64 load and a scalar_to_vector.
On 64-bit targets the generic legalize will use an i64 load and a scalar_to_vector for us. But on 32-bit targets i64 isn't legal and the generic legalizer will end up emitting two 32-bit loads. We have DAG combines that try to put those two loads back together with pretty good success.

This patch instead uses f64 to avoid the splitting entirely. I've made it do the same for 64-bit mode for consistency and to keep the load in the fp domain.

There are a few things in here that look like regressions in 32-bit mode, but I believe they bring us closer to the 64-bit mode codegen. And that the 64-bit mode code could be better. I think those issues should be looked at separately.

Differential Revision: https://reviews.llvm.org/D52528

llvm-svn: 344291
2018-10-11 20:36:06 +00:00
Thomas Lively f04bed8e79 [WebAssembly][NFC] Remove repetition of Defs = [ARGUMENTS] (fixed)
llvm-svn: 344287
2018-10-11 20:21:22 +00:00
Sumanth Gundapaneni a4a9155e4f [Hexagon] Restrict compound instructions with constant value.
Having a constant value operand in the compound instruction
is not always profitable. This patch improves coremark by ~4% on
Hexagon.

Differential Revision: https://reviews.llvm.org/D53152

llvm-svn: 344284
2018-10-11 19:48:15 +00:00
Sumanth Gundapaneni 77418a3753 [Pipeliner] Use the Index from Topo instead of relying on NodeNum. (NFC)
In future, if we may add any new DAG mutations other than artificial dependencies,
the NodeNum may not be valid. Instead the index from topological schedule DAG can be
used as long as we update it with the DAG change.

Differential Revision: https://reviews.llvm.org/D53104

llvm-svn: 344283
2018-10-11 19:45:07 +00:00
Sumanth Gundapaneni 8916e438c4 [Pipeliner] Fix the Schedule DAG topoligical order.
This patch updates the DAG change to reflect in the topological ordering
of the nodes.

Differential Revision: https://reviews.llvm.org/D53105

llvm-svn: 344282
2018-10-11 19:42:46 +00:00
Thomas Lively ab37189f7e [WebAssembly] Revert rL344180, which was breaking expensive checks
llvm-svn: 344280
2018-10-11 18:45:48 +00:00
Zachary Turner e8a6c3eb96 Revert SymbolFileNativePDB plugin.
This was originally causing some test failures on non-Windows
platforms, which required fixes in the compiler and linker.  After
those fixes, however, other tests started failing.  Reverting
temporarily until I can address everything.

llvm-svn: 344279
2018-10-11 18:45:44 +00:00
Artem Dergachev 2ce1d6faf8 Revert r344197 "[MC][ELF] compute entity size for explicit sections"
Revert r344206 "[MC][ELF] Fix section_mergeable_size.ll"

They were causing failures on too many important buildbots for too long.
Please revert eagerly if your fix takes more than a couple of hours to land!

llvm-svn: 344278
2018-10-11 18:43:08 +00:00
Leonard Chan 64e21b5cfd [PassManager/Sanitizer] Port of AddresSanitizer pass from legacy to new PassManager
This patch ports the legacy pass manager to the new one to take advantage of
the benefits of the new PM. This involved moving a lot of the declarations for
`AddressSantizer` to a header so that it can be publicly used via
PassRegistry.def which I believe contains all the passes managed by the new PM.

This patch essentially decouples the instrumentation from the legacy PM such
hat it can be used by both legacy and new PM infrastructure.

Differential Revision: https://reviews.llvm.org/D52739

llvm-svn: 344274
2018-10-11 18:31:51 +00:00
Nirav Dave f1f2a2a31a [DAG] Fix Big Endian in Load-Store forwarding
Summary:
Correct offset calculation in load-store forwarding for big-endian
targets.

Reviewers: rnk, RKSimon, waltl

Subscribers: sdardis, nemanjai, hiraditya, jrtc27, atanasyan, jsji, llvm-commits

Differential Revision: https://reviews.llvm.org/D53147

llvm-svn: 344272
2018-10-11 18:28:59 +00:00
Krzysztof Parzyszek 5d3a6f76a8 [Hexagon] Eliminate potential sources of non-determinism in HCE
Also, avoid comparing GUIDs when ordering global addresses, because
source file location can cause different GUID to be calculated. As a
result, a pair of symbols can compare "less" in one directory, but
"greater" in another.

llvm-svn: 344271
2018-10-11 18:26:02 +00:00
Craig Topper fb2ac8969e [X86] Restore X86ISelDAGToDAG::matchBEXTRFromAnd. Teach address matching to create a BEXTR pattern from a (shl (and X, mask >> C1) if C1 can be folded into addressing mode.
This is an alternative to D53080 since I think using a BEXTR for a shifted mask is definitely an improvement when the shl can be absorbed into addressing mode. The other cases I'm less sure about.

We already have several tricks for handling an and of a shift in address matching. This adds a new case for BEXTR.

I've moved the BEXTR matching code back to X86ISelDAGToDAG to allow it to match. I suppose alternatively we could directly emit a X86ISD::BEXTR node that isel could pattern match. But I'm trying to view BEXTR matching as an isel concern so DAG combine can see 'and' and 'shift' operations that are well understood. We did lose a couple cases from tbm_patterns.ll, but I think there are ways to recover that.

I've also put back the manual load folding code in matchBEXTRFromAnd that I removed a few months ago in r324939. This gives us some more freedom to make decisions based on the ability to fold a load. I haven't done anything with that yet.

Differential Revision: https://reviews.llvm.org/D53126

llvm-svn: 344270
2018-10-11 18:06:07 +00:00
Zachary Turner e502f8b315 Better support for POSIX paths in PDBs.
While it doesn't make a *ton* of sense for POSIX paths to be
in PDBs, it's possible to occur in real scenarios involving
cross compilation.

The tools need to be able to handle this, because certain types
of debugging scenarios are possible without a running process
and so don't necessarily require you to be on a Windows system.
These include post-mortem debugging and binary forensics (e.g.
using a debugger to disassemble functions and examine symbols
without running the process).

There's changes in clang, LLD, and lldb in this patch.  After
this the cross-platform disassembly and source-list tests pass
on Linux.

Furthermore, the behavior of LLD can now be summarized by a much
simpler rule than before: Unless you specify /pdbsourcepath and
/pdbaltpath, the PDB ends up with paths that are valid within
the context of the machine that the link is performed on.

Differential Revision: https://reviews.llvm.org/D53149

llvm-svn: 344269
2018-10-11 18:01:55 +00:00
Sanjay Patel 4875662e57 [DAGCombiner] move comment closer to the corresponding code; NFC
llvm-svn: 344255
2018-10-11 16:07:25 +00:00
Amara Emerson 54f60255a2 [InstCombine] Fix SimplifyLibCalls erasing an instruction while IC still had references to it.
InstCombine keeps a worklist and assumes that optimizations don't
eraseFromParent() the instruction, which SimplifyLibCalls violates. This change
adds a new callback to SimplifyLibCalls to let clients specify their own hander
for erasing actions.

Differential Revision: https://reviews.llvm.org/D52729

llvm-svn: 344251
2018-10-11 14:51:11 +00:00
Diogo N. Sampaio 352a2fa1e7 [AARCH64][FIX] Emit data symbol for constant pool data
The ARM64 elf emitter would omit printing data
symbol for zero filled constant data. This patch
overrides the emitFill method as to enforce that
the symbol is correctly printed.

Differential revision: https://reviews.llvm.org/D53132

llvm-svn: 344248
2018-10-11 14:10:32 +00:00
Dylan McKay e48f27a0b1 Generalize an IR verifier check to work with non-zero program address spaces
This commit modifies an existing IR verifier check that
assumes all functions will be located in the default address
space 0.

Rather than using the default paramater value getPointerTo(AddrSpace=0),
explicitly specify the program memory address space from the data layout.

This only affects targets that specify a nonzero address space
in their data layouts. The only in-tree target that does this
is AVR.

llvm-svn: 344243
2018-10-11 12:49:50 +00:00
David Green 8066198442 [InstCombine] Demand bits of UMin
This is the umin alternative to the umax code from rL344237. We use
DeMorgans law on the umax case to bring us to the same thing on umin,
but using countLeadingOnes, not countLeadingZeros.

Differential Revision: https://reviews.llvm.org/D53036

llvm-svn: 344239
2018-10-11 11:28:27 +00:00
David Green 30c0e98b9c [InstCombine] Demand bits of UMax
Use the demanded bits of umax(A,C) to prove we can just use A so long as the
lowest non-zero bit of DemandMask is higher than the highest non-zero bit of C

Differential Revision: https://reviews.llvm.org/D53033

llvm-svn: 344237
2018-10-11 11:04:09 +00:00
Florian Hahn 18e07bb822 [LV] Use SmallVector instead of DenseMap in calculateRegisterUsage (NFC).
We assign indices sequentially for seen instructions, so we can just use
a vector and push back the seen instructions. No need for using a
DenseMap.

Reviewers: hsaito, rengolin, nadav, dcaballe

Reviewed By: rengolin

Differential Revision: https://reviews.llvm.org/D53089

llvm-svn: 344233
2018-10-11 09:46:25 +00:00
Florian Hahn 7eb5cb4ebc [LV] Ignore more debug info.
We can avoid doing some unnecessary work by skipping debug instructions
in a few loops. It also helps to ensure debug instructions do not
prevent vectorization, although I do not have any concrete test cases
for that.

Reviewers: rengolin, hsaito, dcaballe, aprantl, vsk

Reviewed By: rengolin, dcaballe

Differential Revision: https://reviews.llvm.org/D53091

llvm-svn: 344232
2018-10-11 09:27:24 +00:00
Calixte Denizet d2f290b034 [gcov] Display the hit counter for the line of a function definition
Summary:
Right now there is no hit counter on the line of function.
So the idea is add the line of the function to all the lines covered by the entry block.
Tests in compiler-rt/profile will be fixed in another patch: https://reviews.llvm.org/D49854

Reviewers: marco-c, davidxl

Reviewed By: marco-c

Subscribers: sylvestre.ledru, llvm-commits

Differential Revision: https://reviews.llvm.org/D49853

llvm-svn: 344228
2018-10-11 08:53:43 +00:00
Max Kazantsev 5dbeff3e1c [NFC] Factor out getOrCreateAddRecExpr method
llvm-svn: 344227
2018-10-11 08:46:39 +00:00
Roman Lebedev 4225f4adff [X86][BMI1]: X86DAGToDAGISel: select BEXTR from x & ~(-1 << nbits) pattern
Summary:
As discussed in D48491, we can't really do this in the TableGen,
since we need to produce *two* instructions. This only implements
one single pattern. The other 3 patterns will be in follow-ups.

I'm not sure yet if we want to also fuse shift into here
(i.e `(x >> start) & ...`)

Reviewers: RKSimon, craig.topper, spatel

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D52304

llvm-svn: 344224
2018-10-11 07:51:13 +00:00
Max Kazantsev b2e51090a4 [IndVars] Drop "exact" flag from lshr and udiv when substituting their args
There is a transform that may replace `lshr (x+1), 1` with `lshr x, 1` in case
if it can prove that the result will be the same. However the initial instruction
might have an `exact` flag set, and it now should be dropped unless we prove
that it may hold. Incorrectly set `exact` attribute may then produce poison.

Differential Revision: https://reviews.llvm.org/D53061
Reviewed By: sanjoy

llvm-svn: 344223
2018-10-11 07:22:26 +00:00
Thomas Lively 7fa7e6a284 [WebAssembly][NFC] Use intrinsic dag nodes directly
Summary: Instead of custom lowering to WebAssemblyISD nodes first.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D53119

llvm-svn: 344211
2018-10-11 00:49:24 +00:00
Thomas Lively 2ebacb107b [WebAssembly] Saturating float to int intrinsics
Summary:
Although the saturating float to int instructions are already
emitted from normal IR, the fpto{s,u}i instructions produce poison
values if the argument cannot fit in the result type. These intrinsics
are therefore necessary to get guaranteed defined saturating behavior.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D53004

llvm-svn: 344204
2018-10-11 00:01:25 +00:00
Saleem Abdulrasool 0d1cbcc3eb llvm-c: Add C APIs to access DebugLoc info
Add thin shims to C interface to provide access to DebugLoc info for
Instructions, GlobalVariables and Functions.  Patch by Josh Berdine!

llvm-svn: 344202
2018-10-10 23:53:12 +00:00
Richard Smith 6c67662816 Add a flag to remap manglings when reading profile data information.
This can be used to preserve profiling information across codebase
changes that have widespread impact on mangled names, but across which
most profiling data should still be usable. For example, when switching
from libstdc++ to libc++, or from the old libstdc++ ABI to the new ABI,
or even from a 32-bit to a 64-bit build.

The user can provide a remapping file specifying parts of mangled names
that should be treated as equivalent (eg, std::__1 should be treated as
equivalent to std::__cxx11), and profile data will be treated as
applying to a particular function if its name is equivalent to the name
of a function in the profile data under the provided equivalences. See
the documentation change for a description of how this is configured.

Remapping is supported for both sample-based profiling and instruction
profiling. We do not support remapping indirect branch target
information, but all other profile data should be remapped
appropriately.

Support is only added for the new pass manager. If someone wants to also
add support for this for the old pass manager, doing so should be
straightforward.

This is the LLVM side of Clang r344199.

Reviewers: davidxl, tejohnson, dlj, erik.pilkington

Subscribers: mehdi_amini, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D51249

llvm-svn: 344200
2018-10-10 23:13:47 +00:00
Warren Ristow febfc4e89b [LTO] Account for overriding lib calls via the alias attribute
Given a library call that is represented as an llvm intrinsic call, but
later transformed to an actual call, if an overriding definition of that
library routine is provided indirectly via an alias, prevent LTO from
eliminating the definition.

This is a fix for PR38547.

Differential Revision: https://reviews.llvm.org/D52836

llvm-svn: 344198
2018-10-10 22:54:31 +00:00
Nick Desaulniers 335315697a [MC][ELF] compute entity size for explicit sections
Summary:
Global variables might declare themselves to be in explicit sections.
Calculate the entity size always to prevent assembler warnings
"entity size for SHF_MERGE not specified" when sections are to be
marked merge-able.

Fixes PR31828.

Reviewers: rnk, echristo

Reviewed By: rnk

Subscribers: llvm-commits, pirama, srhines

Differential Revision: https://reviews.llvm.org/D53056

llvm-svn: 344197
2018-10-10 22:52:32 +00:00
Craig Topper b5421c498d [X86] Prevent non-temporal loads from folding into instructions by blocking them in X86DAGToDAGISel::IsProfitableToFold rather than with a predicate.
Remove tryFoldVecLoad since tryFoldLoad would call IsProfitableToFold and pick up the new check.

This saves about 5K out of ~600K on the generated isel table.

llvm-svn: 344189
2018-10-10 21:48:34 +00:00
Richard Smith 2843635829 Support for remapping profile data when symbols change, for sample-based
profiling.

Reviewers: davidxl, tejohnson, dlj, erik.pilkington

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51248

llvm-svn: 344187
2018-10-10 21:31:01 +00:00
George Burgess IV 6ef8002c2c Replace most users of UnknownSize with LocationSize::unknown(); NFC
Moving away from UnknownSize is part of the effort to migrate us to
LocationSizes (e.g. the cleanup promised in D44748).

This doesn't entirely remove all of the uses of UnknownSize; some uses
require tweaks to assume that UnknownSize isn't just some kind of int.
This patch is intended to just be a trivial replacement for all places
where LocationSize::unknown() will Just Work.

llvm-svn: 344186
2018-10-10 21:28:44 +00:00
Richard Smith ceed4eb13d Support for remapping profile data when symbols change, for
instrumentation-based profiling.

Reviewers: davidxl, tejohnson, dlj, erik.pilkington

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51247

llvm-svn: 344184
2018-10-10 21:09:37 +00:00
James Y Knight c0b28d55a7 llvm-ar: Darwin archive format fixes.
* Support writing the DARWIN64 symbol table format.

* In darwin archives, emit a symbol table whenever requested, even
  when there are no members, as the apple linker will abort if given
  an archive without a symbol table.

Added tests for same, and also simplified and moved the GNU 64-bit
symbol table test into archive-symtab.test.

llvm-svn: 344183
2018-10-10 21:07:02 +00:00
Sanjay Patel 05aadf885d [InstCombine] reverse 'trunc X to <N x i1>' canonicalization; 2nd try
Re-trying r344082 because it unintentionally included extra diffs.

Original commit message:
icmp ne (and X, 1), 0 --> trunc X to N x i1

Ideally, we'd do the same for scalars, but there will likely be
regressions unless we add more trunc folds as we're doing here
for vectors.

The motivating vector case is from PR37549:
https://bugs.llvm.org/show_bug.cgi?id=37549

define <4 x float> @bitwise_select(<4 x float> %x, <4 x float> %y, <4 x float> %z, <4 x float> %w) {

  %c = fcmp ole <4 x float> %x, %y
  %s = sext <4 x i1> %c to <4 x i32>
  %s1 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 0, i32 0, i32 1, i32 1>
  %s2 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 2, i32 2, i32 3, i32 3>
  %cond = or <4 x i32> %s1, %s2
  %condtr = trunc <4 x i32> %cond to <4 x i1>
  %r = select <4 x i1> %condtr, <4 x float> %z, <4 x float> %w
  ret <4 x float> %r

}

Here's a sampling of the vector codegen for that case using
mask+icmp (current behavior) vs. trunc (with this patch):

AVX before:

vcmpleps        %xmm1, %xmm0, %xmm0
vpermilps       $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
vpermilps       $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
vorps   %xmm0, %xmm1, %xmm0
vandps  LCPI0_0(%rip), %xmm0, %xmm0
vxorps  %xmm1, %xmm1, %xmm1
vpcmpeqd        %xmm1, %xmm0, %xmm0
vblendvps       %xmm0, %xmm3, %xmm2, %xmm0

AVX after:

vcmpleps        %xmm1, %xmm0, %xmm0
vpermilps       $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
vpermilps       $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
vorps   %xmm0, %xmm1, %xmm0
vblendvps       %xmm0, %xmm2, %xmm3, %xmm0

AVX512f before:

vcmpleps        %xmm1, %xmm0, %xmm0
vpermilps       $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
vpermilps       $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
vorps   %xmm0, %xmm1, %xmm0
vpbroadcastd    LCPI0_0(%rip), %xmm1 ## xmm1 = [1,1,1,1]
vptestnmd       %zmm1, %zmm0, %k1
vblendmps       %zmm3, %zmm2, %zmm0 {%k1}

AVX512f after:

vcmpleps        %xmm1, %xmm0, %xmm0
vpermilps       $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
vpermilps       $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
vorps   %xmm0, %xmm1, %xmm0
vpslld  $31, %xmm0, %xmm0
vptestmd        %zmm0, %zmm0, %k1
vblendmps       %zmm2, %zmm3, %zmm0 {%k1}

AArch64 before:

fcmge   v0.4s, v1.4s, v0.4s
zip1    v1.4s, v0.4s, v0.4s
zip2    v0.4s, v0.4s, v0.4s
orr     v0.16b, v1.16b, v0.16b
movi    v1.4s, #1
and     v0.16b, v0.16b, v1.16b
cmeq    v0.4s, v0.4s, #0
bsl     v0.16b, v3.16b, v2.16b

AArch64 after:

fcmge   v0.4s, v1.4s, v0.4s
zip1    v1.4s, v0.4s, v0.4s
zip2    v0.4s, v0.4s, v0.4s
orr     v0.16b, v1.16b, v0.16b
bsl     v0.16b, v2.16b, v3.16b

PowerPC-le before:

xvcmpgesp 34, 35, 34
vspltisw 0, 1
vmrglw 3, 2, 2
vmrghw 2, 2, 2
xxlor 0, 35, 34
xxlxor 35, 35, 35
xxland 34, 0, 32
vcmpequw 2, 2, 3
xxsel 34, 36, 37, 34

PowerPC-le after:

xvcmpgesp 34, 35, 34
vmrglw 3, 2, 2
vmrghw 2, 2, 2
xxlor 0, 35, 34
xxsel 34, 37, 36, 0

Differential Revision: https://reviews.llvm.org/D52747

llvm-svn: 344181
2018-10-10 20:47:46 +00:00
Thomas Lively eff0542c56 [WebAssembly][NFC] Remove repetition of Defs = [ARGUMENTS]
Summary:
By moving that line into the `I` multiclass.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D53093

llvm-svn: 344180
2018-10-10 20:40:54 +00:00
Roman Lebedev 33d84c6dac [X86] Move X86DAGToDAGISel::matchBEXTRFromAnd() into X86ISelLowering
Summary:
As discussed in [[ https://bugs.llvm.org/show_bug.cgi?id=38938 | PR38938 ]],
we fail to emit `BEXTR` if the mask is shifted.
We can't deal with that in `X86DAGToDAGISel` `before the address mode for the inc is selected`,
and we can't really do it in the normal DAGCombine, because we don't have generic `ISD::BitFieldExtract` node,
and if we simply turn the shifted mask into a normal mask + shift-left, it will be folded back.
So it would seem X86ISelLowering is the place to handle this.

This patch only moves the matchBEXTRFromAnd()
from X86DAGToDAGISel to X86ISelLowering.
It does not add support for the 'shifted mask' pattern.

Reviewers: RKSimon, craig.topper, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52426

llvm-svn: 344179
2018-10-10 20:40:12 +00:00
Sanjay Patel 58fc00d0bc revert r344082: [InstCombine] reverse 'trunc X to <N x i1>' canonicalization
This commit accidentally included the diffs from D53057.

llvm-svn: 344178
2018-10-10 20:39:39 +00:00
David Bolvansky 7e30c91dca [DwarfVerifier] Fixed -Wimplicit-fallthrough warning
Reviewers: JDevlieghere, RKSimon

Reviewed By: JDevlieghere

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52963

llvm-svn: 344176
2018-10-10 20:10:37 +00:00
Thomas Lively 103f0161b3 [WebAssembly][NFC] Use vnot patfrag to simplify v128.not
Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D53097

llvm-svn: 344175
2018-10-10 19:09:16 +00:00
Renato Golin cb19c8e3aa [LV] Add a new reduction pattern match
Adding a new reduction pattern match for vectorizing code similar to TSVC s3111:

for (int i = 0; i < N; i++)
  if (a[i] > b)
    sum += a[i];

This patch adds support for fadd, fsub and fmull, as well as multiple
branches and different (but compatible) instructions (ex. add+sub) in
different branches.

I have forwarded to trunk, added fsub and fmul functionality and
additional tests, but the credit goes to Takahiro, who did most of the
actual work.

Differential Revision: https://reviews.llvm.org/D49168

Patch by Takahiro Miyoshi <takahiro.miyoshi@linaro.org>.

llvm-svn: 344172
2018-10-10 18:49:49 +00:00
Francis Visoiu Mistrih 2e76cab47f Reland: [OptRemarks] Add library for parsing optimization remarks
Add a library that parses optimization remarks (currently YAML, so based
on the YAMLParser).

The goal is to be able to provide tools a remark parser that is not
completely dependent on YAML, in case we decide to change the format
later.

It exposes a C API which takes a handler that is called with the remark
structure.

It adds a libLLVMOptRemark.a static library, and it's used in-tree by
the llvm-opt-report tool (from which the parser has been mostly moved
out).

Differential Revision: https://reviews.llvm.org/D52776

Fixed the tests by removing the usage of C++11 strings, which seems not
to be supported by gcc 4.8.4 if they're used as a macro argument.

llvm-svn: 344171
2018-10-10 18:43:42 +00:00
Scott Linder ad115b7832 [Support] Remove redundant qualifiers in YAMLTraits (NFC)
llvm-svn: 344166
2018-10-10 18:14:02 +00:00
Francis Visoiu Mistrih 7839331ae9 Revert "[OptRemarks] Add library for parsing optimization remarks"
This reverts commit 1cc98e6672b6319fdb00b70dd4474aabdadbe193.

Seems to break bots: http://lab.llvm.org:8011/builders/clang-x86_64-linux-abi-test/builds/33398/steps/build-unified-tree/logs/stdio

llvm-svn: 344164
2018-10-10 18:07:44 +00:00
Francis Visoiu Mistrih 057784a263 [OptRemarks] Add library for parsing optimization remarks
Add a library that parses optimization remarks (currently YAML, so based
on the YAMLParser).

The goal is to be able to provide tools a remark parser that is not
completely dependent on YAML, in case we decide to change the format
later.

It exposes a C API which takes a handler that is called with the remark
structure.

It adds a libLLVMOptRemark.a static library, and it's used in-tree by
the llvm-opt-report tool (from which the parser has been mostly moved
out).

Differential Revision: https://reviews.llvm.org/D52776

llvm-svn: 344162
2018-10-10 17:58:09 +00:00
Renato Golin d8e7ca4a32 [VPlan] Fix CondBit quoting in dumpBasicBlock
Quotes were being printed for VPInstructions but not the rest.

llvm-svn: 344161
2018-10-10 17:55:21 +00:00
Scott Linder 3759efc650 Relax trivial cast requirements in CallPromotionUtils
Differential Revision: https://reviews.llvm.org/D52792

llvm-svn: 344153
2018-10-10 16:35:47 +00:00
Nirav Dave 07acc992dc [DAGCombine] Improve Load-Store Forwarding
Summary:
Extend analysis forwarding loads from preceeding stores to work with
extended loads and truncated stores to the same address so long as the
load is fully subsumed by the store.

Hexagon's swp-epilog-phis.ll and swp-memrefs-epilog1.ll test are
deleted as they've no longer seem to be relevant.

Reviewers: RKSimon, rnk, kparzysz, javed.absar

Subscribers: sdardis, nemanjai, hiraditya, atanasyan, llvm-commits

Differential Revision: https://reviews.llvm.org/D49200

llvm-svn: 344142
2018-10-10 14:15:52 +00:00
Sanjay Patel 6cca8af227 [x86] allow single source horizontal op matching (PR39195)
This is intended to restore horizontal codegen to what it looked like before IR demanded elements improved in:
rL343727

As noted in PR39195:
https://bugs.llvm.org/show_bug.cgi?id=39195
...horizontal ops can be worse for performance than a shuffle+regular binop, so I've added a TODO. Ideally, we'd 
solve that in a machine instruction pass, but a quicker solution will be adding a 'HasFastHorizontalOp' feature
bit to deal with it here in the DAG.

Differential Revision: https://reviews.llvm.org/D52997

llvm-svn: 344141
2018-10-10 13:39:59 +00:00
Jonas Devlieghere fc51490baf Lift VFS from clang to llvm (NFC)
This patch moves the virtual file system form clang to llvm so it can be
used by more projects.

Concretely the patch:
 - Moves VirtualFileSystem.{h|cpp} from clang/Basic to llvm/Support.
 - Moves the corresponding unit test from clang to llvm.
 - Moves the vfs namespace from clang::vfs to llvm::vfs.
 - Formats the lines affected by this change, mostly this is the result of
   the added llvm namespace.

RFC on the mailing list:
http://lists.llvm.org/pipermail/llvm-dev/2018-October/126657.html

Differential revision: https://reviews.llvm.org/D52783

llvm-svn: 344140
2018-10-10 13:27:25 +00:00
Simon Pilgrim bc7b6251b6 [TargetLowering] SimplifyDemandedBits - rename demanded mask args. NFCI.
Help stop bugs like rL343935 by making the 'original' DemandedBits arg more obviously not the mask that is actually used.

llvm-svn: 344138
2018-10-10 13:00:49 +00:00
Simon Pilgrim 53a503f6ac [TargetLowering] SimplifyDemandedBits - pull out repeated getOperands. NFCI.
Part of a minor cleanup to make all the switch statements more consistent prior to improving vector support.

llvm-svn: 344136
2018-10-10 12:32:13 +00:00
Carlos Alberto Enciso c0952c8a08 Revert "[DebugInfo][Dexter] Unreachable line stepped onto after SimplifyCFG."
This reverts commit r344120.

It was causing buildbot failures.

llvm-svn: 344135
2018-10-10 12:09:34 +00:00
Valery Pykhtin 52391ea0b3 [TableGen] fix assert in !cast when used out of definition in a multiclass
Differential Revision: https://reviews.llvm.org/D53068

llvm-svn: 344134
2018-10-10 10:52:57 +00:00
Simon Pilgrim 5cb3a82892 [TargetLowering] Add root node back to work list after successful SimplifyDemandedBits/SimplifyDemandedVectorElts
Similar to what already happens in the DAGCombiner wrappers, this patch adds the root nodes back onto the worklist if the DCI wrappers' SimplifyDemandedBits/SimplifyDemandedVectorElts were successful.

Differential Revision: https://reviews.llvm.org/D53026

llvm-svn: 344132
2018-10-10 10:44:15 +00:00
Jonas Paulsson bf66f38705 [SystemZ] Temporarily disable high VFs with integer div/rem.
Until mischeduler is clever enough to avoid spilling in a vectorized loop
with many (scalar) DLRs it is better to avoid high vectorization factors (8
and above).

llvm-svn: 344129
2018-10-10 09:30:29 +00:00
Neil Henning 3d4579829e Fix an ordering bug in the scalarizer.
I've added a new test case that causes the scalarizer to try and use
dead-and-erased values - caused by the basic blocks not being in
domination order within the function. To fix this, instead of iterating
through the blocks in function order, I walk them in reverse post order.

Differential Revision: https://reviews.llvm.org/D52540

llvm-svn: 344128
2018-10-10 09:27:45 +00:00
Carlos Alberto Enciso e7a347e5f8 [DebugInfo][Dexter] Unreachable line stepped onto after SimplifyCFG.
When SimplifyCFG changes the PHI node into a select instruction, the debug line records becomes ambiguous. It causes the debugger to display unreachable source lines. 

Differential Revision: https://reviews.llvm.org/D52887

llvm-svn: 344120
2018-10-10 08:29:55 +00:00
Craig Topper 02c62aa58a [X86] Remove FeatureRTM from Skylake processor list
Summary:
There are a LOT of Skylakes and later without TSX-NI. Examples:
- SKL: https://ark.intel.com/products/136863/Intel-Core-i3-8121U-Processor-4M-Cache-up-to-3-20-GHz-
- KBL: https://ark.intel.com/products/97540/Intel-Core-i7-7560U-Processor-4M-Cache-up-to-3-80-GHz-
- KBL-R: https://ark.intel.com/products/149091/Intel-Core-i7-8565U-Processor-8M-Cache-up-to-4-60-GHz-
- CNL: https://ark.intel.com/products/136863/Intel-Core-i3-8121U-Processor-4M-Cache-up-to-3_20-GHz

This feature seems to be present only on high-end desktop and server
chips (I can't find any SKX without). This commit leaves it disabled
for all processors, but can be re-enabled for specific builds with
-mrtm.

Patch by Thiago Macieira

Reviewers: erichkeane, craig.topper

Reviewed By: craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D53041

llvm-svn: 344116
2018-10-10 07:43:35 +00:00
Jonas Paulsson 2c8b33770c [SystemZ] Take better care when computing needed vector registers in TTI.
A new function getNumVectorRegs() is better to use for the number of needed
vector registers instead of getNumberOfParts(). This is to make sure that the
number of vector registers (and typically operations) required for a vector
type is accurate.

getNumberOfParts() which was previously used works by splitting the vector
type until it is legal gives incorrect results for types with a non
power of two number of elements (rare).

A new static function getScalarSizeInBits() that also checks for a pointer
type and returns 64U for it since otherwise it gets a value of 0). Used in a
few places where Ty may be pointer.

Review: Ulrich Weigand
llvm-svn: 344115
2018-10-10 07:36:27 +00:00
George Burgess IV 40dc63e1f0 [Analysis] Make LocationSizes carry an 'imprecise' bit
There are places where we need to merge multiple LocationSizes of
different sizes into one, and get a sensible result.

There are other places where we want to optimize aggressively based on
the value of a LocationSizes (e.g. how can a store of four bytes be to
an area of storage that's only two bytes large?)

This patch makes LocationSize hold an 'imprecise' bit to note whether
the LocationSize can be treated as an upper-bound and lower-bound for
the size of a location, or just an upper-bound.

This concludes the series of patches leading up to this. The most recent
of which is r344108.

Fixes PR36228.

Differential Revision: https://reviews.llvm.org/D44748

llvm-svn: 344114
2018-10-10 06:39:40 +00:00
Max Kazantsev 1d893bfcea [NFC] Make a variable const
llvm-svn: 344113
2018-10-10 04:19:38 +00:00
QingShan Zhang bc1586352e [PowerPC] Fix the assert of ISD::SIGN_EXTEND_INREG when type is v2i16 and v2i8
For ISD::SIGN_EXTEND_INREG operation of v2i16 and v2i8 types will cause assert because they are registered as custom operation. 
So that the type legalization phase will enter the custom hook, which do not handle ISD::SIGN_EXTEND_INREG operation and fall throw into unreachable assert.

Patch By: wuzish (Zixuan Wu)
Differential Revision: https://reviews.llvm.org/D52449

llvm-svn: 344109
2018-10-10 02:33:48 +00:00
George Burgess IV d98d505c0d [Analysis] Make LocationSize pretty-printing more descriptive
This is the third patch in a series intended to make
https://reviews.llvm.org/D44748 more easily reviewable. Please see that
patch for more context. The second being r344013.

The intent is to make the output of printing a LocationSize more
precise. The main motivation for this is that we plan to add a bit to
distinguish whether a given LocationSize is an upper-bound or is
precise; making that information available in pretty-printing is nice.

llvm-svn: 344108
2018-10-10 01:35:22 +00:00
Thomas Lively 108e98ec32 [WebAssembly] Fix fneg lowering
Summary:
Subtraction from zero and floating point negation do not have the same
semantics, so fix lowering.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52948

llvm-svn: 344107
2018-10-10 01:09:09 +00:00
Heejin Ahn 5d900954bd [WebAssembly] Improve comments for SIMD instruction definitions
llvm-svn: 344106
2018-10-10 01:04:02 +00:00
Fangrui Song 88478bbc60 [opt] Change the parameter of OptTable::PrintHelp from Name to Usage and don't append "[options] <inputs>"
Summary:
Before, "[options] <inputs>" is unconditionally appended to the `Name` parameter. It is more flexible to change its semantic to `Usage` and let user customize the usage line.

% llvm-objcopy
...
USAGE: llvm-objcopy <input> [ <output> ] [options] <inputs>

With this patch:

% llvm-objcopy
...
USAGE: llvm-objcopy input [output]

Reviewers: rupprecht, alexshap, jhenderson

Reviewed By: rupprecht

Subscribers: jakehehrlich, mehdi_amini, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D51009

llvm-svn: 344097
2018-10-10 00:15:31 +00:00
Thomas Lively 409f5840a7 [WebAssembly] Handle V128 register class in explicit locals pass
Summary:
Also add tests to catch crashes in passes that are not normally run in
tests.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52959

llvm-svn: 344094
2018-10-09 23:33:16 +00:00
Nemanja Ivanovic 72d4866e57 [DAGCombiner] Expand combining of FP logical ops to sign-setting FP ops
We already do the following combines:
(bitcast int (and (bitcast fp X to int), 0x7fff...) to fp) -> fabs X
(bitcast int (xor (bitcast fp X to int), 0x8000...) to fp) -> fneg X

When the target has "bit preserving fp logic". This patch just extends it
to also combine:
(bitcast int (or (bitcast fp X to int), 0x8000...) to fp) -> fneg (fabs X)

As some targets have fnabs and even those that don't can efficiently lower
both the fabs and the fneg.

Differential revision: https://reviews.llvm.org/D44548

llvm-svn: 344093
2018-10-09 23:20:11 +00:00
Rong Xu 5c7bf1a756 [X86] Fix sanitizer bot failure from 344085
Fix the memory issue exposed by sanitizer.

llvm-svn: 344092
2018-10-09 23:10:56 +00:00
Heejin Ahn d9a6de3c38 [WebAssembly] Improve readability of SIMD instructions (NFC)
Summary:
- Categorize instructions into the categories as in the SIMD spec
- Move SIMD-related definition to WebAssemblyInstrSIMD.td
- Put definition and use of patterns together
- Add newlines here and there

Reviewers: tlively

Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D53045

llvm-svn: 344086
2018-10-09 22:23:39 +00:00