Commit Graph

171362 Commits

Author SHA1 Message Date
Sanjay Patel fa1c0fe478 [x86] try to form broadcast before widening shuffle elements
I noticed that we weren't generating broadcasts as much I thought we would with 
D54271, and this is part of the problem.

Widening the shuffle elements means adding bitcasts and hiding the relationship 
between a splatted scalar and the vector. If we can form a broadcast, do that 
before going through the rest of the shuffle lowering because broadcasts should 
be cheap and can often be load-folded.

Differential Revision: https://reviews.llvm.org/D54280

llvm-svn: 346498
2018-11-09 14:54:58 +00:00
Alex Bradbury 1cc2d0b9fb [RISCV] Avoid unnecessary XOR for seteq/setne 0
Differential Revision: https://reviews.llvm.org/D53492

Patch by James Clarke.

llvm-svn: 346497
2018-11-09 14:47:36 +00:00
Alex Bradbury a9da1de5f4 [RISCV] Update test/CodeGen/RISCV/calling-conv.ll after rL346432
The DAGCombiner changes led to a different schedule.

llvm-svn: 346496
2018-11-09 14:35:44 +00:00
Petar Avramovic 2cefaa2747 [MIPS GlobalISel] narrowScalar G_CONSTANT
Legalize s64 G_CONSTANT using narrowScalar on MIPS 32.

Differential Revision: https://reviews.llvm.org/D54255

llvm-svn: 346495
2018-11-09 14:21:16 +00:00
Krzysztof Parzyszek f740fd647a [Hexagon] Handle Hexagon's SHF_HEX_GPREL section flag
llvm-svn: 346494
2018-11-09 14:17:27 +00:00
Clement Courbet df78bf1452 [llvm-exegesis] Fix unit tests on PowerPC/AArch64.
We were comparing char*s and not contents. Introduced in rL346489.

llvm-svn: 346493
2018-11-09 14:08:29 +00:00
Florian Hahn 9f878e9bae Revert r346483: [CallSiteSplitting] Only record conditions up to the IDom(call site).
This cause a failure with EXPENSIVE_CHECKS

llvm-svn: 346492
2018-11-09 13:28:58 +00:00
Simon Pilgrim ea51f98b9b [X86] Add Subtarget to more lowerVectorShuffle functions. NFCI.
This will be necessary for an update to D54267

llvm-svn: 346490
2018-11-09 13:19:03 +00:00
Clement Courbet eee2e06e2a [llvm-exegesis][NFC] Add a way to declare the default counter binding for unbound CPUs for a target.
Summary:
This simplifies the code and moves everything to tablegen for consistency. This
also prepares the ground for adding issue counters.

Reviewers: gchatelet, john.brawn, jsji

Subscribers: nemanjai, mgorny, javed.absar, kbarton, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D54297

llvm-svn: 346489
2018-11-09 13:15:32 +00:00
Andrea Di Biagio dffec12f33 [llvm-mca] Use a small vector for instructions in the EntryStage.
Use a simple SmallVector to track the lifetime of simulated instructions.
An ordered map was not needed because instructions are already picked in program
order. It is also much faster if we avoid searching for already retired
instructions at the end of every cycle.
The new policy only triggers a "garbage collection" when the number of retired
instructions becomes significantly big when compared with the total size of the
vector.

While working on this, I noticed that instructions were correctly retired, but
their internal state was not updated (i.e. there was no transition from the
EXECUTED state, to the RETIRED state). While this was not a problem for the
views, it prevented the EntryStage from correctly garbage collecting already
retired instructions. That was a bad oversight, and this patch fixes it.

The observed speedup on a debug build of llvm-mca after this patch is ~6%.
On a release build of llvm-mca, the observed speedup is ~%15%.

llvm-svn: 346487
2018-11-09 12:29:57 +00:00
Florian Hahn a1062f4b68 [IPSCCP,PM] Preserve DT in the new pass manager.
After D45330, Dominators are required for IPSCCP and can be preserved.

This patch preserves DominatorTreeAnalysis in the new pass manager. AFAIK the legacy pass manager cannot preserve function analysis required by a module analysis.

Reviewers: davide, dberlin, chandlerc, efriedma, kuhar, NutshellySima

Reviewed By: chandlerc, kuhar, NutshellySima

Differential Revision: https://reviews.llvm.org/D47259

llvm-svn: 346486
2018-11-09 11:52:27 +00:00
Alexandros Lamprineas e15c982f6d [SelectionDAG] swap select_cc operands to enable folding
The DAGCombiner tries to SimplifySelectCC as follows:

  select_cc(x, y, 16, 0, cc) -> shl(zext(set_cc(x, y, cc)), 4)

It can't cope with the situation of reordered operands:

  select_cc(x, y, 0, 16, cc)

In that case we just need to swap the operands and invert the Condition Code:

  select_cc(x, y, 16, 0, ~cc)

Differential Revision: https://reviews.llvm.org/D53236

llvm-svn: 346484
2018-11-09 11:09:40 +00:00
Florian Hahn 52578f95c9 [CallSiteSplitting] Only record conditions up to the IDom(call site).
We can stop recording conditions once we reached the immediate dominator
for the block containing the call site. Conditions in predecessors of the
that node will be the same for all paths to the call site and splitting
is not beneficial.

This patch makes CallSiteSplitting dependent on the DT anlysis. because
the immediate dominators seem to be the easiest way of finding the node
to stop at.

I had to update some exiting tests, because they were checking for
conditions that were true/false on all paths to the call site. Those
should now be handled by instcombine/ipsccp.

Reviewers: davide, junbuml

Reviewed By: junbuml

Differential Revision: https://reviews.llvm.org/D44627

llvm-svn: 346483
2018-11-09 10:23:46 +00:00
Clement Courbet e6b727e552 [X86] Fix VZEROUPPER scheduling info on SNB,HSW,BDW,SXL,SKX.
Summary:
Starting from SNB, VZEROUPPER is handled by the renamer and uses no proc resources.
After HSW, it also has zero latency.

This fixes PR35606.

To reproduce:
Uops:
  llvm-exegesis -mode=uops -opcode-name=VZEROUPPER
Latency:
  echo -e '#LLVM-EXEGESIS-DEFREG XMM0 1\n#LLVM-EXEGESIS-DEFREG XMM1 1\nvzeroupper' | /tmp/llvm-exegesis -mode=latency -snippets-file=-
  echo -e '#LLVM-EXEGESIS-DEFREG XMM0 1\n#LLVM-EXEGESIS-DEFREG XMM1 1\nvzeroupper\naddps %xmm0, %xmm1' | /tmp/llvm-exegesis -mode=latency -snippets-file=-

Reviewers: RKSimon, craig.topper, andreadb

Subscribers: gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D54107

llvm-svn: 346482
2018-11-09 09:49:06 +00:00
Carlos Alberto Enciso fa9cf89734 [DebugInfo][Dexter] Unreachable line stepped onto after SimplifyCFG.
In SimplifyCFG when given a conditional branch that goes to BB1 and BB2, the hoisted common terminator instruction in the two blocks, caused debug line records associated with subsequent select instructions to become ambiguous. It causes the debugger to display unreachable source lines.

Differential Revision: https://reviews.llvm.org/D53390

llvm-svn: 346481
2018-11-09 09:42:10 +00:00
Sam Parker 08979cd125 [ARM] Enable mixed types in ARM CGP
Previously, during the search, all values had to have the same
'TypeSize', which is equal to number of bits of the integer type of
the icmp operand. All values in the tree had to match this size;
meaning that, if we searched from i16, we wouldn't accept i8s. A
change in type size requires zext and truncs to perform the casts so,
to allow mixed narrow types, the handling of these instructions is
now slightly different:

- we allow casts if their result or operand is <= TypeSize.
- zexts are sinks if their result > TypeSize.
- truncs are still sinks if their operand == TypeSize.
- truncs are still sources if their result == TypeSize.

The transformation bails on finding an icmp that operates on data
smaller than the current TypeSize.

Differential Revision: https://reviews.llvm.org/D54108

llvm-svn: 346480
2018-11-09 09:28:27 +00:00
Sam Parker 453ba916a0 [ARM] Small reorganisation in ARMParallelDSP
A few code movement things:

- AreSymmetrical is now a method of BinOpChain.
- Created a lambda in CreateParallelMACPairs to reduce loop nesting.
- A Reduction object now gets pasted in a couple of places instead,
  including CreateParallelMACPairs so it doesn't need to return a
  value.
I've also added RecordSequentialLoads, which is run before the
transformation begins, and caches the interesting loads. This can then
be queried later instead of cross checking many load values.

Differential Revision: https://reviews.llvm.org/D54254

llvm-svn: 346479
2018-11-09 09:18:00 +00:00
Dean Michael Berris da375a67f8 [XRay] Improve FDR trace handling and error messaging
Summary:
This change covers a number of things spanning LLVM and compiler-rt,
which are related in a non-trivial way.

In LLVM, we have a library that handles the FDR mode even log loading,
which uses C++'s runtime polymorphism feature to better faithfully
represent the events that are written down by the FDR mode runtime. We
do this by interpreting a trace that's serliased in a common format
agreed upon by both the trace loading library and the FDR mode runtime.
This library is under active development, which consists of features
allowing us to reconstitute a higher-level event log.

This event log is used by the conversion and visualisation tools we have
for interpreting XRay traces.

One of the tools we have is a diagnostic tool in llvm-xray called
`fdr-dump` which we've been using to debug our expectations of what the
FDR runtime should be writing and what the logical FDR event log
structures are. We use this fairly extensively to reason about why some
non-trivial traces we're generating with FDR mode runtimes fail to
convert or fail to parse correctly.

One of these failures we've found in manual debugging of some of the
traces we've seen involve an inconsistency between the buffer extents (a
record indicating how many bytes to follow are part of a logical
thread's event log) and the record of the bytes written into the log --
sometimes it turns out the data could be garbage, due to buffers being
recycled, but sometimes we're seeing the buffer extent indicating a log
is "shorter" than the actual records associated with the buffer. This
case happens particularly with function entry records with a call
argument.

This change for now updates the FDR mode runtime to write the bytes for
the function call and arg record before updating the buffer extents
atomically, allowing multiple threads to see a consistent view of the
data in the buffer using the atomic counter associated with a buffer.
What we're trying to prevent here is partial updates where we see the
intermediary updates to the buffer extents (function record size then
call argument record size) becoming observable from another thread, for
instance, one doing the serialization/flushing.

To do both diagnose this issue properly, we need to be able to honour
the extents being set in the `BufferExtents` records marking the
beginning of the logical buffers when reading an FDR trace. Since LLVM
doesn't use C++'s RTTI mechanism, we instead follow the advice in the
documentation for LLVM Style RTTI
(https://llvm.org/docs/HowToSetUpLLVMStyleRTTI.html). We then rely on
this RTTI feature to ensure that our file-based record producer (our
streaming "deserializer") can honour the extents of individual buffers
as we interpret traces.

This also sets us up to be able to eventually do smart
skipping/continuation of FDR logs, seeking instead to find BufferExtents
records in cases where we find potentially recoverable errors. In the
meantime, we make this change to operate in a strict mode when reading
logical buffers with extent records.

Reviewers: mboerger

Subscribers: hiraditya, llvm-commits, jfb

Differential Revision: https://reviews.llvm.org/D54201

llvm-svn: 346473
2018-11-09 06:26:48 +00:00
Max Kazantsev 9883d1e1a7 [NFC] Add utility function for SafetyInfo updates for moveBefore
llvm-svn: 346472
2018-11-09 05:39:04 +00:00
Petr Hosek e2f6896eef [llvm-rc] Support joined or separate spelling for /fo flag
CMake invokes rc using the joined spelling which appears to be supported
by Microsoft's rc implementation, so we should support it as well.

Differential Revision: https://reviews.llvm.org/D54191

llvm-svn: 346470
2018-11-09 03:16:53 +00:00
Mandeep Singh Grang 397765bc51 [COFF, ARM64] Add support for MSVC buffer security check
Reviewers: rnk, mstorsjo, compnerd, efriedma, TomTan

Reviewed By: rnk

Subscribers: javed.absar, kristof.beyls, chrib, llvm-commits

Differential Revision: https://reviews.llvm.org/D54248

llvm-svn: 346469
2018-11-09 02:48:36 +00:00
Thomas Lively 2faf079494 [WebAssembly] Read prefixed opcodes as ULEB128s
Summary: Depends on D54126.

Reviewers: aheejin, dschuff, aardappel

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D54138

llvm-svn: 346465
2018-11-09 01:57:00 +00:00
Thomas Lively 4ddd22581e [WebAssembly][NFC] Reorder SIMD section
Summary:
Reorders the sections in the SIMD tablegen file to roughly match the
new opcode ordering. Depends on D54126.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D54134

llvm-svn: 346464
2018-11-09 01:49:19 +00:00
Thomas Lively 299d214aba [WebAssembly] Renumber and LEB128-encode SIMD opcodes
Reviewers: aheejin, dschuff, aardappel

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D54126

llvm-svn: 346463
2018-11-09 01:45:56 +00:00
Thomas Lively 38c902bc2e [WebAssembly] Lower select for vectors
Summary:

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D53675

llvm-svn: 346462
2018-11-09 01:38:44 +00:00
Jonas Devlieghere 86ef28f1a2 [not] Improve error reporting consistency.
Makes `not` use WithColor from Support so it prints 'error' in color
when applicable.

llvm-svn: 346460
2018-11-09 01:17:22 +00:00
Jonas Devlieghere f5b6d11cf2 [VFS] Add "expand tilde" argument to getRealPath.
Add an optional argument to expand tildes in the path to mirror llvm's
implementation of the corresponding function.

llvm-svn: 346453
2018-11-09 00:26:10 +00:00
Petr Hosek 1f597e6e6b [llvm-rc] Support absolute filenames in manifests
CMake generate manifests that contain absolute filenames and these
currently result in assertion error. This change ensures that we
handle these correctly.

Differential Revision: https://reviews.llvm.org/D54194

llvm-svn: 346450
2018-11-08 23:45:00 +00:00
Philip Reames 8c7b78767a [docs][statepoint] Document explicitly provided stack slots
Functionality for this was added a while ago, though never documented or extensively tested.  Document it with an explicit warning.

llvm-svn: 346448
2018-11-08 23:20:40 +00:00
Philip Reames e777f013ac [docs][statepoints] add a section spelling out simplifications for non-relocating GCs
llvm-svn: 346447
2018-11-08 23:07:04 +00:00
Philip Reames 78b46457fb [docs] Add some subsections to make it possible to find portions of the statepoint overview
llvm-svn: 346446
2018-11-08 22:56:41 +00:00
Heejin Ahn 0c68a875fa [WebAssembly] Fix LowerEmscriptenEHSjLj when there's only longjmp
Summary:
The pass incorrectly assumed if there's a longjmp declaration in the
module, there is also a setjmp function declaration. Fixed it, and now
the pass only converts longjmp and does not do any other transformation
when there's no setjmp declaration in the module.

Fixes PR39562.

Reviewers: jgravelle-google, sbc100

Subscribers: dschuff, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D54273

llvm-svn: 346445
2018-11-08 22:56:26 +00:00
Eli Friedman ab296670e1 [ARM64] [Windows] Improve error reporting for unsupported SEH unwind.
Use report_fatal_error instead of crashing or miscompiling. (It's
currently easier than it should be to hit this case because we don't
reuse codes across epilogs.)

llvm-svn: 346440
2018-11-08 21:20:52 +00:00
Florian Hahn a684a99441 [LoopInterchange] Support reductions across inner and outer loop.
This patch adds logic to detect reductions across the inner and outer
loop by following the incoming values of PHI nodes in the outer loop. If
the incoming values take part in a reduction in the inner loop or come
from outside the outer loop, we found a reduction spanning across inner
and outer loop.

With this change, ~10% more loops are interchanged in the LLVM
test-suite + SPEC2006.

Fixes https://bugs.llvm.org/show_bug.cgi?id=30472

Reviewers: mcrosier, efriedma, karthikthecool, davide, hfinkel, dmgreen

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D43245

llvm-svn: 346438
2018-11-08 20:44:19 +00:00
Craig Topper 8cca8bd4aa [SelectionDAG] Assert on the width of DemandedElts argument to computeKnownBits for all vector typed operations not just build_vector.
Fix AArch64 unit test that fails with the assertion added.

llvm-svn: 346437
2018-11-08 20:29:17 +00:00
Pirama Arumuga Nainar e61652a384 [LTO] Drop non-prevailing definitions only if linkage is not local or appending
Summary:
This fixes PR 37422

In ELF, non-weak symbols can also be non-prevailing.  In this particular
PR, the __llvm_profile_* symbols are non-prevailing but weren't getting
dropped - causing multiply-defined errors with lld.

Also add a test, strong_non_prevailing.ll, to ensure that multiple
copies of a strong symbol are dropped.

To fix the test regressions exposed by this fix,
- do not mark prevailing copies for symbols with 'appending' linkage.
There's no one prevailing copy for such symbols.
- fix the prevailing version in dead-strip-fulllto.ll
- explicitly pass exported symbols to llvm-lto in fumcimport.ll and
funcimport_var.ll

Reviewers: tejohnson, pcc

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith,
dang, srhines, llvm-commits

Differential Revision: https://reviews.llvm.org/D54125

llvm-svn: 346436
2018-11-08 20:10:07 +00:00
Simon Pilgrim 0b01062dba [X86] Regenerate loaduse test
llvm-svn: 346434
2018-11-08 19:42:11 +00:00
Sanjay Patel b5535dc7b3 [x86] use shuffles for scalar insertion into high elements of a constant vector
As discussed in D54073, we have a potential regression from more aggressive vector narrowing here, so let's try to avoid that by changing build-vector lowering slightly.

Insert-vector-element lowering always does this since there's no "pinsr" for ymm/zmm:

// If the vector is wider than 128 bits, extract the 128-bit subvector, insert
// into that, and then insert the subvector back into the result.

...but we can sometimes do better for insert-into-constant-vector by using shuffle lowering.

Differential Revision: https://reviews.llvm.org/D54271

llvm-svn: 346433
2018-11-08 19:16:27 +00:00
Nirav Dave 6ce9f72f76 [DAGCombine] Improve alias analysis for chain of independent stores.
FindBetterNeighborChains simulateanously improves the chain
dependencies of a chain of related stores avoiding the generation of
extra token factors. For chains longer than the GatherAllAliasDepths,
stores further down in the chain will necessarily fail, a potentially
significant waste and preventing otherwise trivial parallelization.

This patch directly parallelize the chains of stores before improving
each store. This generally improves DAG-level parallelism.

Reviewers: courbet, spatel, RKSimon, bogner, efriedma, craig.topper, rnk

Subscribers: sdardis, javed.absar, hiraditya, jrtc27, atanasyan, llvm-commits

Differential Revision: https://reviews.llvm.org/D53552

llvm-svn: 346432
2018-11-08 19:14:20 +00:00
Zachary Turner 056e4ab497 [NativePDB] Higher fidelity reconstruction of AST from Debug Info.
In order to accurately put a type into the correct location in the AST
we construct from debug info, we need to be able to determine what
DeclContext (namespace, global, nested class, etc) that it goes into.
PDB doesn't contain this mapping.  It does, however, contain the reverse
mapping.  That is, for a given class type T, you can determine all
classes Q1, Q2, ..., Qn that are nested inside of T.  We need to know,
for a given class type Q, what type T is it nested inside of.

This patch builds this map as a pre-processing step when we first
load the PDB by scanning every type.  Initial tests show that while
this can be slow in debug builds of LLDB, it is quite fast in release
builds (less than 2 seconds for a ~1GB PDB, and it only needs to happen
once).

Furthermore, having this pre-processing step in place allows us to
repurpose it for building up other kinds of indexing to it down the
line.  For the time being, this gives us very accurate reconstruction
of the DeclContext hierarchy.

Differential Revision: https://reviews.llvm.org/D54216

llvm-svn: 346429
2018-11-08 18:50:11 +00:00
Sanjay Patel c4f719feb0 [x86] add RUNs for AVX1; NFC
Differences in splat-ability might be reason to differentiate some cases.

llvm-svn: 346426
2018-11-08 18:18:20 +00:00
Roman Lebedev 3817292069 [NFC][BdVer2] Load and store throughput tests: also check sched stats (PR39465)
As noted by Andrea Di Biagio in https://bugs.llvm.org/show_bug.cgi?id=39465
both the loads and stores occupy both the store and load queues.
This is clearly wrong.

llvm-svn: 346425
2018-11-08 18:15:58 +00:00
Matt Davis 9e719705d0 [llvm-mca] Partially revert r346417.
Restored the llvm:: namespace qualifier on make_unique.
This removes the ambiguity with make_unique.  

llvm-svn: 346424
2018-11-08 18:08:43 +00:00
Nicolai Haehnle 6979b70427 Add test case for the regression caused by r344696
(That change has since been reverted.)

Reduced from https://bugs.freedesktop.org/show_bug.cgi?id=108611

llvm-svn: 346423
2018-11-08 18:01:38 +00:00
Tom Stellard 28d662164d InstCombine: Avoid introducing poison values when lowering llvm.amdgcn.[us]bfe
Summary:
When the 3rd argument to these intrinsics is zero, lowering them
to shift instructions produces poison values, since we end up with
shift amounts equal to the number of bits in the shifted value.  This
means we can only lower these intrinsics if we can prove that the
3rd argument is not zero.

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: bnieuwenhuizen, jvesely, wdng, nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D53739

llvm-svn: 346422
2018-11-08 17:57:57 +00:00
Vedant Kumar d6699423f1 [CodeExtractor] Mark functions noreturn when applicable
This eliminates the outlining penalty for llvm.trap/unreachable, because
callers no longer have to emit cleanup/ret instructions after calling an
outlined `noreturn` function.

rdar://45523626

llvm-svn: 346421
2018-11-08 17:57:09 +00:00
Andrea Di Biagio d66f4e472a [llvm-mca] PR39261: Rename FetchStage to EntryStage.
This fixes PR39261.

FetchStage is a misnomer. It causes confusion with the frontend fetch stage,
which we don't currently simulate.  I decided to rename it into EntryStage
mainly because this is meant to be a "source" stage for all pipelines.

Differential Revision: https://reviews.llvm.org/D54268

llvm-svn: 346419
2018-11-08 17:49:30 +00:00
Matt Davis 08b64d60fe [llvm-mca] Remove unneeded namespace qualifier. NFC.
llvm-svn: 346417
2018-11-08 17:32:45 +00:00
Philip Reames 9ffd5eb081 [docs] Clarify ELF section naming for StackMaps and fix a typo
llvm-svn: 346416
2018-11-08 17:20:35 +00:00
Adrian Prantl 778fba3188 [dsymutil] Copy the LC_BUILD_VERSION load command into the companion binary.
LC_BUILD_VERSION contains platform information that is useful for LLDB
to match up dSYM bundles with binaries. This patch copies the load
command over into the dSYM.

rdar://problem/44145175
rdar://problem/45883463

Differential Revision: https://reviews.llvm.org/D54233

llvm-svn: 346412
2018-11-08 16:54:59 +00:00