Commit Graph

128536 Commits

Author SHA1 Message Date
Craig Topper fee9067261 [X86] Move all the FP_TO_XINT/XINT_TO_FP setOperationActions into the same !useSoftFloat block. Qualify all of the Promote actions for these with !useSoftFloat too. NFCI
The Promote action doesn't apply until LegalizeDAG. By the time
we get there, we would have already softened all the FP operations
if useSoftFloat was true. So there wouldn't be any operation left
to Promote.
2019-11-13 14:07:54 -08:00
Hiroshi Yamauchi 3f0969daf9 [PGO][PGSO] Temporarily disable the large working set size behavior.
Summary:
This temporarily disables the large working set size behavior in profile guided
size optimization due to internal benchmark regressions.

Reviewers: davidxl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70207
2019-11-13 14:00:47 -08:00
Sanjay Patel a3e61946c5 [SLP] fix miscompile on min/max reductions with extra uses (PR43948)
The bug manifests as replacing a reduction operand with an undef
value.

The problem appears to be limited to cases where a min/max reduction
has extra uses of the compare operand to the select.

In the general case, we are tracking "ExternallyUsedValues" and
an "IgnoreList" of the reduction operations, but those may not apply
to the final compare+select in a min/max reduction.

For that, we use replaceAllUsesWith (RAUW) to ensure that the new
vectorized reduction values are transferred to all subsequent users.

Differential Revision: https://reviews.llvm.org/D70148
2019-11-13 15:57:35 -05:00
Craig Topper 84e83b54bd [TargetLowering] Increase the storage size of NumRegistersForVT to allow the type break down for v256i1 and other types to be stored correctly
v256i1 on X86 without avx512 breaks down to 256 i8 values when passed between basic blocks. But the NumRegistersForVT was sized at a byte for each VT. This results in 256 being stored as 0.

This patch enlarges the type to 16 bits and adds an assert to ensure that no information is lost when the entry is stored.

Differential Revision: https://reviews.llvm.org/D70138
2019-11-13 12:09:35 -08:00
Simon Atanasyan 63bbbcde9f [mips] Reduce number of nested `if` statements. NFC 2019-11-13 22:57:55 +03:00
Simon Atanasyan 3216d28449 [mips] Add tests to check `jal sym+offset`. NFC 2019-11-13 22:57:54 +03:00
Quentin Colombet de94cda81b [LiveInterval] Allow updating subranges with slightly out-dated IR
During register coalescing, we update the live-intervals on-the-fly.
To do that we are in this strange mode where the live-intervals can
be slightly out-of-sync (more precisely they are forward looking)
compared to what the IR actually represents.
This happens because the register coalescer only updates the IR when
it is done with updating the live-intervals and it has to do it this
way because updating the IR on-the-fly would actually clobber some
information on how the live-ranges that are being updated look like.

This is problematic for updates that rely on the IR to accurately
represents the state of the live-ranges. Right now, we have only
one of those: stripValuesNotDefiningMask.
To reconcile this need of out-of-sync IR, this patch introduces a
new argument to LiveInterval::refineSubRanges that allows the code
doing the live range updates to reason about how the code should
look like after the coalescer will have rewritten the registers.
Essentially this captures how a subregister index with be offseted
to match its position in a new register class.

E.g., let say we want to merge:
    V1.sub1:<2 x s32> = COPY V2.sub3:<4 x s32>

We do that by choosing a class where sub1:<2 x s32> and sub3:<4 x s32>
overlap, i.e., by choosing a class where we can find "offset + 1 == 3".
Put differently we align V2's sub3 with V1's sub1:
    V2: sub0 sub1 sub2 sub3
    V1: <offset>  sub0 sub1

This offset will look like a composed subregidx in the the class:
     V1.(composed sub2 with sub1):<4 x s32> = COPY V2.sub3:<4 x s32>
 =>  V1.(composed sub2 with sub1):<4 x s32> = COPY V2.sub3:<4 x s32>

Now if we didn't rewrite the uses and def of V1, all the checks for V1
need to account for this offset to match what the live intervals intend
to capture.

Prior to this patch, we would fail to recognize the uses and def of V1
and would end up with machine verifier errors: No live segment at def.
This could lead to miscompile as we would drop some live-ranges and
thus, miss some interferences.

For this problem to trigger, we need to reach stripValuesNotDefiningMask
while having a mismatch between the IR and the live-ranges (i.e.,
we have to apply a subreg offset to the IR.)

This requires the following three conditions:
1. An update of overlapping subreg lanes: e.g., dsub0 == <ssub0, ssub1>
2. An update with Tuple registers with a possibility to coalesce the
   subreg index: e.g., v1.dsub_1 == v2.dsub_3
3. Subreg liveness enabled.

looking at the IR to decide what is alive and what is not, i.e., calling
stripValuesNotDefiningMask.
coalescer maintains for the live-ranges information.

None of the targets that currently use subreg liveness (i.e., the targets
that fulfill #3, Hexagon, AMDGPU, PowerPC, and SystemZ IIRC) expose #1 and
and #2, so this patch also artificial enables subreg liveness for ARM,
so that a nice test case can be attached.
2019-11-13 11:17:56 -08:00
Ahmed Bougacha 7313d7d618 [AArch64][v8.3a] Add missing imp-defs on RETA*.
RETA always implicitly uses LR, unlike RET which merely has an
alias that defaults it to LR.
Additionally, RETA implicitly uses SP as well, which it uses as
a discriminator to authenticate LR.

This isn't usually noticeable, because RET_ReallyLR is used in most
of the backend.  However, the post-RA scheduler, if enabled, will
cause miscompiles if the imp-uses are missing.

While there, fix a typo in the lone affected testcase.
2019-11-13 10:38:11 -08:00
Ahmed Bougacha 643ac6c042 [AArch64][v8.3a] Add LDRA '[xN]!' alias.
The instruction definition has been retroactively expanded to
allow for an alias for '[xN, 0]!' as '[xN]!'.
That wouldn't make sense on LDR, but does for LDRA.
2019-11-13 10:38:11 -08:00
David Stenberg 7417cc149b Fix typo in DwarfDebug [NFC] 2019-11-13 18:06:16 +01:00
Sanjay Patel e9bf7a60a0 [SLP] reduce code duplication for min/max vs. other reductions; NFCI 2019-11-13 11:26:08 -05:00
Matthew Malcomson e5f3760e8c Fix comment spelling {addresing -> addressing} (NFC) 2019-11-13 16:14:32 +00:00
Sanjay Patel 3d6b53980c [InstCombine] propagate fast-math-flags (FMF) to select when inverting fcmp+select
As noted by the FIXME comment, this is not correct based on our current FMF semantics.
We should be propagating FMF from the final value in a sequence (in this case the
'select'). So the behavior even without this patch is wrong, but we did not allow FMF
on 'select' until recently.

But if we do the correct thing right now in this patch, we'll inevitably introduce
regressions because we have not wired up FMF propagation for 'phi' and 'select' in
other passes (like SimplifyCFG) or other places in InstCombine. I'm not seeing a
better incremental way to make progress.

That said, the potential extra damage over the existing wrong behavior from this
patch is very limited. AFAIK, the only way to have different FMF on IR in the same
function is if we have LTO inlined IR from 2 modules that were compiled using
different fast-math settings.

As seen in the tests, we may actually see some improvements with this patch because
adding the FMF to the 'select' allows matching to min/max intrinsics that were
previously missed (in the common case, the 'fcmp' and 'select' should have identical
FMF to begin with).

Next steps in the transition:

    Make similar changes in instcombine as needed.
    Enable phi-to-select FMF propagation in SimplifyCFG.
    Remove dependencies on fcmp with FMF.
    Deprecate FMF on fcmp.

Differential Revision: https://reviews.llvm.org/D69720
2019-11-13 10:38:42 -05:00
Pavel Labath 1eea3fa063 DWARFDebugLoclists: Add an api to get the location lists of a DWARF unit
Summary:
This avoid the need to duplicate the location lists searching logic in
various users. The "inline location list dumping" code (which is the
only user actually updated to handle DWARF v5 location lists)  is
switched to this method. After adding v4 location list support, I'll
switch other users too.

Reviewers: dblaikie, probinson, JDevlieghere, aprantl, SouraVX

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70084
2019-11-13 16:26:16 +01:00
Simon Pilgrim 86f07e826f PowerPC - fix uninitialized variable warnings. NFCI. 2019-11-13 14:40:21 +00:00
Simon Pilgrim e1670175f2 Fix uninitialized variable warning. NFCI. 2019-11-13 14:40:21 +00:00
Simon Pilgrim 29a5a6eed0 Fix uninitialized variable warning. NFCI. 2019-11-13 14:40:21 +00:00
Simon Pilgrim 6ebc5089b2 Fix uninitialized variable warning. NFCI. 2019-11-13 14:40:20 +00:00
Simon Pilgrim b3be859baa Sparc - fix uninitialized variable warnings. NFCI. 2019-11-13 14:40:20 +00:00
Simon Pilgrim 66f2ed0746 PPCReduceCRLogicals - fix static analyzer warnings. NFC
- Fix uninitialized variable warnings.
- Fix null dereference warnings.
2019-11-13 14:40:20 +00:00
Simon Pilgrim d1bd5e476b SLPVectorizer - make comparison operators + isInSchedulingRegion const
Fixes cppcheck warnings.
2019-11-13 14:40:19 +00:00
Florian Hahn f7499011ca [InstCombine] Avoid moving ops that do restrict undef across shuffles.
I think we have to be a bit more careful when it comes to moving
ops across shuffles, if the op does restrict undef. For example, without
this patch, we would move 'and %v, <0, 0, -1, -1>' over a
'shufflevector %a, undef, <undef, undef, 1, 2>'. As a result, the first
2 lanes of the result are undef after the combine, but they really
should be 0, unless I am missing something.

For ops that do fold to undef on undef operands, the current behavior
should be fine. I've add conservative check OpDoesRestrictUndef, maybe
there's a better existing utility?

Reviewers: spatel, RKSimon, lebedev.ri

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D70093
2019-11-13 13:40:34 +00:00
Luís Marques c5b56caa32 Revert "[RISCV] Fix wrong CFI directives"
test/DebugInfo/RISCV/relax-debug-frame.ll wasn't properly updated.
2019-11-13 13:28:33 +00:00
Sjoerd Meijer d90804d26b [ARM][MVE] canTailPredicateLoop
This implements TTI hook 'preferPredicateOverEpilogue' for MVE.  This is a
first version and it operates on single block loops only. With this change, the
vectoriser will now determine if tail-folding scalar remainder loops is
possible/desired, which is the first step to generate MVE tail-predicated
vector loops.

This is disabled by default for now. I.e,, this is depends on option
-disable-mve-tail-predication, which is off by default.

I will follow up on this soon with a patch for the vectoriser to respect loop
hint 'vectorize.predicate.enable'. I.e., with this loop hint set to Disabled,
we don't want to tail-fold and we shouldn't query this TTI hook, which is
done in D70125.

Differential Revision: https://reviews.llvm.org/D69845
2019-11-13 13:24:33 +00:00
Luís Marques a5ce8bd715 [RISCV] Fix wrong CFI directives
Summary: Removes CFI CFA directives that could incorrectly propagate
beyond the basic block they were inteded for. Specifically it removes
the epilogue CFI directives. See the branch_and_tail_call test for an
example of the issue. Should fix the stack unwinding issues caused by
the incorrect directives.

Reviewers: asb, lenary, shiva0217
Reviewed By: lenary
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69723
2019-11-13 13:06:15 +00:00
Simon Pilgrim 4d0e7b628a [X86][AVX] Add plausible schedule classes to MASKPAIR/VP2INTERSECT/VDPBF16PS instructions
These are really just placeholders that use approximately the right resources - once we have CPUs scheduler models that support these instructions they will need revisiting.

In the meantime this means that all instructions have a class of some kind., meaning models can be more easily flagged as complete.
2019-11-13 12:02:01 +00:00
Hans Wennborg 6ea4775900 Revert 57dd4b0 "[ValueTracking] Allow context-sensitive nullness check for non-pointers"
This caused miscompiles of Chromium (https://crbug.com/1023818). The reduced
repro is small enough to fit here:

  $ cat /tmp/a.c
  unsigned char f(unsigned char *p) {
    unsigned char result = 0;
    for (int shift = 0; shift < 1; ++shift)
      result |= p[0] << (shift * 8);
    return result;
  }
  $ bin/clang -O2 -S -o - /tmp/a.c | grep -A4 f:
  f:                                      # @f
          .cfi_startproc
  # %bb.0:                                # %entry
          xorl    %eax, %eax
          retq

That's nicely optimized, but I don't think it's the right result :-)

> Same as D60846 but with a fix for the problem encountered there which
> was a missing context adjustment in the handling of PHI nodes.
>
> The test that caused D60846 to be reverted was added in e15ab8f277.
>
> Reviewers: nikic, nlopes, mkazantsev,spatel, dlrobertson, uabelho, hakzsam
>
> Subscribers: hiraditya, bollu, llvm-commits
>
> Tags: #llvm
>
> Differential Revision: https://reviews.llvm.org/D69571

This reverts commit 57dd4b03e4.
2019-11-13 12:19:02 +01:00
Mirko Brkusanin fed17867cd [Mips] Add rematerialization support for ldi.fmt
Instruction ldi.fmt can be considered cheap enough to avoid spill and restore
of value that it produces since it's loaded from immediate.

Differential Revision: https://reviews.llvm.org/D69898
2019-11-13 11:33:52 +01:00
Simon Atanasyan 068db2ed4d [mips] Show an error if 64-bit target triple provided with 32-bit CPU
When a 64-bit triple is used emit an error if the CPU only supports
32-bit code.

Patch by Miloš Stojanović.

Differential Revision: https://reviews.llvm.org/D70018
2019-11-13 13:32:39 +03:00
Daniil Suchkov cba4a27745 Temporarily revert "[InstCombine] Fold PHIs with equal incoming pointers"
Revert due to sanitizer-windows buildbot failure.

This reverts commit bbb29738b5.
2019-11-13 17:14:11 +07:00
David Stenberg 5e646ff530 [DebugInfo] Avoid creating entry values for clobbered registers
Summary:
Entry values are considered for parameters that have register-described
DBG_VALUEs in the entry block (along with other conditions).

If a parameter's value has been propagated from the caller to the
callee, then the parameter's DBG_VALUE in the entry block may be
described using a register defined by some instruction, and entry values
should not be emitted for the parameter, which can currently occur.
One such case was seen in the attached test case, in which the second
parameter, which is described by a redefinition of the first parameter's
register, would incorrectly get an entry value using the first
parameter's register. This commit intends to solve such cases by keeping
track of register defines, and ignoring DBG_VALUEs in the entry block
that are described by such registers.

In a RelWithDebInfo build of clang-8, the average size of the set was
27, and in a RelWithDebInfo+ASan build it was 30.

Reviewers: djtodoro, NikolaPrica, aprantl, vsk

Reviewed By: djtodoro, vsk

Subscribers: hiraditya, llvm-commits

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D69889
2019-11-13 11:10:47 +01:00
David Stenberg 4fec44cd61 [DebugInfo] Add helper for finding entry value candidates [NFC]
Summary:
The conditions that are used to determine if entry values should be
emitted for a parameter are quite many, and will grow slightly
in a follow-up commit, so move those to a helper function, as was
suggested in the code review for D69889.

Reviewers: djtodoro, NikolaPrica

Reviewed By: djtodoro

Subscribers: probinson, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69955
2019-11-13 11:10:47 +01:00
Sander de Smalen 3367686b4d [AArch64] Extend storeRegToStackSlot to spill SVE registers.
This patch allows the register allocator to spill SVE registers to the stack.

Reviewers: ostannard, efriedma, rengolin, cameron.mcinally

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D70082
2019-11-13 10:09:32 +00:00
Daniil Suchkov bbb29738b5 [InstCombine] Fold PHIs with equal incoming pointers
In case when all incoming values of a PHI are equal pointers, this
transformation inserts a definition of such a pointer right after
definition of the base pointer and replaces with this value both PHI and
all it's incoming pointers. Primary goal of this transformation is
canonicalization of this pattern in order to enable optimizations that
can't handle PHIs. Non-inbounds pointers aren't currently supported.

Reviewers: spatel, RKSimon, lebedev.ri, apilipenko

Reviewed By: apilipenko

Tags: #llvm

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D68128
2019-11-13 17:00:34 +07:00
Sander de Smalen 9a1c243aa5 [AArch64][SVE] Allocate locals that are scalable vectors.
This patch adds a target interface to set the StackID for a given type,
which allows scalable vectors (e.g. `<vscale x 16 x i8>`) to be assigned a
'sve-vec' StackID, so it is allocated in the SVE area of the stack frame.

Reviewers: ostannard, efriedma, rengolin, cameron.mcinally

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D70080
2019-11-13 09:45:24 +00:00
Simon Tatham 5b9e4daef0 [ARM,MVE] Use VMOV.{S8,S16} for sign-extended extractelement.
MVE includes instructions that extract an 8- or 16-bit lane from a
vector and sign-extend it into the output 32-bit GPR. `ARMInstrMVE.td`
already included isel patterns to select those instructions in
response to the `ARMISD::VGETLANEs` selection-DAG node type. But
`ARMISD::VGETLANEs` was never actually generated, because the code
that creates it was conditioned on NEON only.

It's an easy fix to enable the same code for integer MVE, and now IR
that sign-extends the result of an extractelement (whether explicitly
or as part of the function call ABI) will use `vmov.s8` instead of
`vmov.u8` followed by `sxtb`.

Reviewers: SjoerdMeijer, dmgreen, ostannard

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70132
2019-11-13 09:08:41 +00:00
joanlluch d384ad6b63 [TargetLowering][DAGCombine][MSP430] Shift Amount Threshold in DAGCombine (4)
Summary:
Replaces
```
unsigned getShiftAmountThreshold(EVT VT)
```
by

```
bool shouldAvoidTransformToShift(EVT VT, unsigned amount)
```
thus giving more flexibility for targets to decide whether particular shift amounts must be considered expensive or not.

Updates the MSP430 target with a custom implementation.

This continues  D69116, D69120, D69326 and updates them, so all of them must be committed before this.

Existing tests apply, a few more have been added.

Reviewers: asl, spatel

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70042
2019-11-13 09:23:08 +01:00
Craig Topper a4b7613a49 [X86] Remove setOperationAction for FP_TO_SINT v8i16.
This is no longer needed after widening legalization as we
custom legalize v8i8 ourselves.

Added entries to the cost model, but bumped the cost slightly
to account for the truncate shuffle that wasn't costed before.
2019-11-12 22:45:52 -08:00
Francesco Petrogalli d8b6b11143 [VFABI] Add LLVM internal mangling for vector functions.
Summary:
This patch adds a custom ISA for vector functions for internal use
in LLVM. The <isa> token is set to "_LLVM_", and it is not attached
to any specific instruction Vector ISA, or Vector Function ABI.

The ISA is used as a token for handling Vector Function ABI-style
vectorization for those vector functions that are not directly
associated to any existing Vector Function ABI (for example, some of
the vector functions exposed by TargetLibraryInfo). The demangling
function for this ISA in a Vector Function ABI context is set to be
the same as the common one shared between X86 and AArch64.

Reviewers: jdoerfert, sdesmalen, simoll

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70089
2019-11-13 03:26:39 +00:00
Matt Arsenault 9d7bccab66 AMDGPU: Extend add x, (ext setcc) combine to sub
This is the same as the add case, but inverts the operation type.

This avoids regressions in a future patch.
2019-11-13 07:13:58 +05:30
Matt Arsenault 4b47213951 AMDGPU: Switch backend default max workgroup size to 1024
Previously this would default to 256, not the maximum supported size
of 1024. Using a maximum lower than the hardware maximum requires
language runtimes to enforce this limit for correctness, which no
language has correctly done. Switch the default to the conservatively
correct maximum, and force frontends to opt-in to the more optimal 256
default maximum.

I don't really understand why the changes in occupancy-levels.ll
increased the computed occupancy, which I expected to decrease. I'm
not sure if these tests should be forcing the old maximum.
2019-11-13 07:11:02 +05:30
Matt Arsenault 25c5da5a42 AMDGPU Reduce reported maximum group size to 1024
While some targets allow encoding 2048, this was never tested or
supported.
2019-11-13 06:34:28 +05:30
Eric Christopher 7a3ad48d6d Temporarily Revert "Reapply [LVI] Normalize pointer behavior" as it's broken python 3.6.
Reverting to figure out if it's a problem in python or the compiler for now.

This reverts commit 885a05f48a.
2019-11-12 15:51:51 -08:00
Craig Topper 3e1aee2ba7 [X86] Don't consider v64i1 as a legal type unless v64i8 is also a legal type.
This avoids some nasty issues with argument passing and lowering of
arbitrary v64i8 shuffles.
2019-11-12 14:56:02 -08:00
Craig Topper 0f04ffc073 [X86] Only pass v64i8/v32i16 as v16i32 on non-avx512bw targets if the v16i32 type won't be split by prefer-vector-width=256
Otherwise just let the v64i8/v32i16 types be split to v32i8/v16i16.

In reality this shouldn't happen because it means we have a 512-bit
vector argument, but min-legal-vector-width says a value less than
512. But a 512-bit argument should have been factored into the
preferred vector width.
2019-11-12 14:56:01 -08:00
Yonghong Song 166cdc0281 [BPF] generate BTF_KIND_VARs for all non-static globals
Enable to generate BTF_KIND_VARs for non-static
default-section globals which is not allowed previously.
Modified the existing test case to accommodate the new change.

Also removed unused linkage enum members VAR_GLOBAL_TENTATIVE and
VAR_GLOBAL_EXTERNAL.

Differential Revision: https://reviews.llvm.org/D70145
2019-11-12 14:34:08 -08:00
Alina Sbirlea db69f1b229 [GlobalsAA] Restrict ModRef result if any internal method has its address taken.
Summary:
If there are any internal methods whose address was taken, conclude there is nothing known in relation of any other internal method and a global.

Reviewers: nlopes, sanjoy.google

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69690
2019-11-12 14:24:56 -08:00
Alina Sbirlea 4ae74cc99f [GVNHoist] Preserve AAResults.
Resolves PR38906, PR40898.
2019-11-12 14:10:04 -08:00
Evandro Menezes 9b1e86f0cb [AArch64] Update for Exynos
Fix the modeling for loads and stores using the register offset addresing mode.
2019-11-12 14:37:41 -06:00
Evandro Menezes 98856e3943 [AArch64] Fix addressing mode predicates
Fix predicates related to the register offset addressing mode.
2019-11-12 14:37:28 -06:00
Peter Collingbourne 1549b4699a ARM: Don't emit R_ARM_NONE relocations to compact unwinding decoders in .ARM.exidx on Android.
These relocations are specified by the ARM EHABI (section 6.3). As I understand
it, their purpose is to accommodate unwinder implementations that wish to
reduce code size by placing the implementations of the compact unwinding
decoders in a separate translation unit, and using extern weak symbols to
refer to them from the main unwinder implementation, so that they are only
linked when something in the binary needs them in order to unwind.

However, neither of the unwinders used on Android (libgcc, LLVM libunwind)
use this technique, and in fact emitting these relocations ends up being
counterproductive to code size because they cause a copy of the unwinder
to be statically linked into most binaries, regardless of whether it is
actually needed. Furthermore, these relocations create circular dependencies
(between libc and the unwinder) in cases where the unwinder is dynamically
linked and libc contains compact unwind info.

Therefore, deviate from the EHABI here and stop emitting these relocations
on Android.

Differential Revision: https://reviews.llvm.org/D70027
2019-11-12 10:52:59 -08:00
Krzysztof Parzyszek ef150e2ea5 [Hexagon] Update PS_aligna with max stack alignment once isel completes 2019-11-12 11:47:29 -06:00
Krzysztof Parzyszek 592dd45924 [Hexagon] Fix vector spill expansion to use proper alignment
1. Add pseudos PS_vloadrv_ai and PS_vstorerv_ai: those are now used
   for single vector registers in loadRegFromStackSlot (and store...).
2. Remove pseudos PS_vloadrwu_ai and PS_vstorerwu_ai. The alignment is
   now checked when expanding spill pseudos (both in frame lowering
   and in expand-post-ra-pseudos), and a proper instruction is generated.
3. Update MachineMemOperands when dealigning vector spill slots.
4. Return vector predicate registers in getCallerSavedRegs.
2019-11-12 09:43:21 -06:00
Krzysztof Parzyszek e3eb10c541 [Hexagon] Convert stack object offsets to int64, NFC
This will print [SP-56] instead of [SP+4294967240].
2019-11-12 09:43:21 -06:00
Krzysztof Parzyszek 67294c97fb [Hexagon] Handle stack realignment in hexagon-vextract 2019-11-12 09:43:21 -06:00
Krzysztof Parzyszek 0a58ef5eb5 [Hexagon] Require PS_aligna whenever variable-sized objects are present 2019-11-12 09:43:21 -06:00
Tom Weaver 41c3f76dcd [DBG][OPT] Attempt to salvage or undef debug info when removing trivially deletable instructions in the Reassociate Expression pass.
Reviewed By: aprantl, vsk

Differential revision: https://reviews.llvm.org/D69943
2019-11-12 15:17:04 +00:00
Jinsong Ji 4cc0c2998d [PowerPC][NFC]Fix typo in desc for enable-ppc-prefetching 2019-11-12 14:46:57 +00:00
Alex Denisov 5022a5fcae Mark llvm::ConstantExpr::getAsInstruction as const
Summary:
getAsInstruction is the only non-const member method.
It is impossible to enforce const-correctness because of it.

Reviewers: jmolloy, majnemer

Reviewed By: jmolloy

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70113
2019-11-12 14:24:12 +01:00
Florian Hahn 636412bf31 [AArch64ExpandPseudos] Preserve renamable state when expanding MOVi64 & co.
If the MOVi operand was renamable, the operands of the expanded
instructions are also renamable.

Reviewers: thegameg, samparker, zatrazz

Reviewed By: thegameg

Differential Revision: https://reviews.llvm.org/D70061
2019-11-12 11:29:04 +00:00
Diana Picus 7f1dcc8952 [InstCombine] Skip scalable vectors in combineLoadToOperationType
Don't try to canonicalize loads to scalable vector types to loads
of integers.

This removes one assertion when trying to use a TypeSize as a parameter
to DataLayout::isLegalInteger. It does not handle the second part of the
function (which looks at bitcasts).

This patch also contains a NFC fix for Load Analysis, where a variable
initialization that would cause the same assertion is moved closer to
its use. This allows us to run the new test for InstCombine without
having to teach LocationSize to play nicely with scalable vectors.

Differential Revision: https://reviews.llvm.org/D70075
2019-11-12 12:27:09 +01:00
Simon Pilgrim 6da34a8b84 FileCheckPattern::FindRegexVarEnd - make helper function static. NFC
Fixes cppcheck warning.
2019-11-12 11:14:19 +00:00
Florian Hahn 1ee93240c0 [LoopInterchange] Only skip PHIs with incoming values from the inner loop.
Currently we have limited support for outer loops with multiple basic
blocks after the inner loop exit. But the current checks for creating
PHIs for loop exit values only assumes the header and latches of the
outer loop. It is better to just skip incoming values defined in the
original inner loops. Those are handled earlier.

Reviewers: efriedma, mcrosier

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D70059
2019-11-12 10:30:51 +00:00
Pavel Labath ebe2f56030 DWARFDebugLoclists: add location list "interpretation" logic
Summary:
This patch extracts the logic for computing the "absolute" locations,
which was partially present in the debug_loclists dumper, completes it,
and moves it into a separate function. This makes it possible to later
reuse the same logic for uses other than dumping.

The dumper is changed to reuse the location list interpreter, and its
format is changed somewhat. In "verbose" mode it prints the "raw" value
of a location list, the interpreted location (if available) and the
expression itself. In non-verbose mode it prints only one of the
location forms: it prefers the interpreted form, but falls back to the
"raw" format if interpretation is not possible (for instance, because we
were not given a base address, or the resolution of indirect addresses
failed).

This patch also undos some of the changes made in D69672, namely the
part about making all functions static. The main reason for this is that
I learned that the original approach (dumping only fully resolved
locations) meant that it was impossible to rewrite one of the existing
tests. To make that possible (and make the "inline location" dump work
in more cases), I now reuse the same dumping mechanism as is used for
section-based dumping. As this required having more objects know about
the various location lists classes, it seemed like a good idea to create
an interface abstracting the difference between them.

Therefore, I now create a DWARFLocationTable class, which will serve as
a base class for the location list classes. DWARFDebugLoclists is made
to inherit from that. DWARFDebugLoc will follow.

Another positive effect of this change is that section-based dumping
code will not need to use templates (as originally) envisioned, and that
the argument lists of the dumping functions become shorter.

Reviewers: dblaikie, probinson, JDevlieghere, aprantl, SouraVX

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70081
2019-11-12 10:40:13 +01:00
Tim Renouf 07ebd74154 MCP: Fixed bug with dest overlapping copy source
In MachineCopyPropagation, when propagating the source of a copy into
the operand of a later instruction, bail if a destination overlaps
(partly defines) the copy source. If the instruction where the
substitution is happening is also a copy, allowing the propagation
confuses the tracking mechanism.

Differential Revision: https://reviews.llvm.org/D69953

Change-Id: Ic570754f878f2d91a4a50a9bdcf96fbaa240726d
2019-11-12 08:18:11 +00:00
Craig Topper ff1504da6f [X86] Update stale comment. NFC 2019-11-11 23:55:12 -08:00
Georgii Rymar dd101539da [yaml2obj/obj2yaml] - Add support for SHT_LLVM_LINKER_OPTIONS sections.
SHT_LLVM_LINKER_OPTIONS section contains pairs of null-terminated strings.
This patch adds support for them.

Differential revision: https://reviews.llvm.org/D69895
2019-11-12 09:55:20 +03:00
Hideto Ueno 88b04ef832 [Attributor] Use must-be-executed-context in align deduction
Summary:
This patch introduces align attribute deduction for callsite argument, function argument, function returned and floating value based on must-be-executed-context.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69797
2019-11-12 06:41:19 +00:00
Nick Terrell 43ff634772 [Support] Optimize SHA1 implementation
* Add inline to the helper functions because gcc-9 won't inline all of
  them without the hint. I've avoided `__attribute__((always_inline))`
  because gcc and clang will inline without it, and improves
  compatibility.
* Replace the byte-by-byte copy in update() with endian::readbe32()
  since perf reports that 1/2 of the time is spent copying into the
  buffer before this patch.

When lld uses --build-id=sha1 it spends 30-45% of CPU in SHA1 depending on the binary (not wall-time since it is parallel). This patch speeds up SHA1 by a factor of 2 on clang-8 and 3 on gcc-6. This leads to a >10% improvement in overall linking time.

lld-speed-test benchmarks run on an Intel i9-9900k with Turbo disabled on CPU 0 compiled with clang-9. Stats recorded with `perf stat -r 5`. All inputs are using `--build-id=sha1`.

| Input | Before (seconds) | After (seconds) |
| --- | --- | --- |
| chrome | 2.14 | 1.82 (-15%) |
| chrome-icf | 2.56 | 2.29 (-10%) |
| clang | 0.65 | 0.53 (-18%) |
| clang-fsds | 0.69 | 0.58 (-16%) |
| clang-gdb-index | 21.71 | 19.3 (-11%) |
| gold | 0.42 | 0.34 (-19%) |
| gold-fsds | 0.431 | 0.355 (-17%) |
| linux-kernel | 0.625 | 0.575 (-8%) |
| llvm-as | 0.045 | 0.039 (-14%) |
| llvm-as-fsds | 0.035 | 0.039 (-11%) |
| mozilla | 11.3 | 9.8  (-13%) |
| mozilla-gc | 11.84 | 10.36 (-12%) |
| mozilla-O0 | 8.2 | 5.84 (-28%) |
| scylla | 5.59 | 4.52 (-19%) |

Reviewed By: ruiu, MaskRay

Differential Revision: https://reviews.llvm.org/D69295
2019-11-11 22:14:28 -08:00
Fangrui Song 3c4f8bb108 AMDGPU/SI: make ~SIScheduleBlockCreator trivial 2019-11-11 21:51:59 -08:00
Fangrui Song 644de3b96e [PDB] Make pdb::DbiModuleDescriptor destructor trivial 2019-11-11 21:26:26 -08:00
Vasileios Porpodas 6a18a95487 [SLP] Look-ahead operand reordering heuristic.
Summary: This patch introduces a new heuristic for guiding operand reordering. The new "look-ahead" heuristic can look beyond the immediate predecessors. This helps break ties when the immediate predecessors have identical opcodes (see lit test for examples).

Reviewers: RKSimon, ABataev, dtemirbulatov, Ayal, hfinkel, rnk

Reviewed By: RKSimon, dtemirbulatov

Subscribers: xbolva00, Carrot, hiraditya, phosek, rnk, rcorcs, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60897
2019-11-11 21:06:51 -08:00
Thomas Finch ac385ca63f Fix null dereference in yaml::Document::skip
Summary: The attached test case replicates a null dereference crash in
`yaml::Document::skip()`. This was fixed by adding a check and early
return in the method.

Reviewers: Bigcheese, hintonda, beanz

Reviewed By: hintonda

Subscribers: hiraditya, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69974
2019-11-11 20:48:28 -08:00
Francesco Petrogalli e9a06e0606 [VFABI] Read/Write functions for the VFABI attribute.
The attribute is stored at the `FunctionIndex` attribute set, with the
name "vector-function-abi-variant".

The get/set methods of the attribute have assertion to verify that:

1. Each name in the attribute is a valid VFABI mangled name.

2. Each name in the attribute correspond to a function declared in the
   module.

Differential Revision: https://reviews.llvm.org/D69976
2019-11-12 03:40:42 +00:00
aqjune 4187cb138b Add InstCombine/InstructionSimplify support for Freeze Instruction
Summary:
- Add llvm::SimplifyFreezeInst
- Add InstCombiner::visitFreeze
- Add llvm tests

Reviewers: majnemer, sanjoy, reames, lebedev.ri, spatel

Reviewed By: reames, lebedev.ri

Subscribers: reames, lebedev.ri, filcab, regehr, trentxintong, llvm-commits

Differential Revision: https://reviews.llvm.org/D29013
2019-11-12 12:13:26 +09:00
Craig Topper 578f3b5dce [X86] Remove setOperationAction lines that say to promote MVT::i1
MVT::i1 should be removed by type legalization before we reach
any code that would act on the promote action.

Mainly to avoid replicating this for strict FP versions of these
operations.
2019-11-11 18:35:57 -08:00
Fangrui Song 2d0eb38d4c [MC] Make MCFragment trivially destructible 2019-11-11 18:11:15 -08:00
aqjune e87d71668e [IR] Redefine Freeze instruction
Summary:
This patch redefines freeze instruction from being UnaryOperator to a subclass of UnaryInstruction.

ConstantExpr freeze is removed, as discussed in the previous review.
FreezeOperator is not added because there's no ConstantExpr freeze.
`freeze i8* null` test is added to `test/Bindings/llvm-c/freeze.ll` as well, because the null pointer-related bug in `tools/llvm-c/echo.cpp` is now fixed.
InstVisitor has visitFreeze now because freeze is not unaryop anymore.

Reviewers: whitequark, deadalnix, craig.topper, jdoerfert, lebedev.ri

Reviewed By: craig.topper, lebedev.ri

Subscribers: regehr, nlopes, mehdi_amini, hiraditya, steven_wu, dexonsmith, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69932
2019-11-12 10:49:00 +09:00
Craig Topper 6c86d6efaf [X86] Remove some else branches after checking for !useSoftFloat() that set operations to Expand.
If we're using soft floats, then these operations shoudl be
softened during type legalization. They'll never get to
LegalizeVectorOps or LegalizeDAG so they don't need to be
Expanded there.
2019-11-11 16:32:19 -08:00
Sean Fertile e5e2e0a66b [PowerPC][XCOFF] Add support for zero initialized global values.
For XCOFF, globals mapped into the .bss section are linked as COMMON
definitions. This behaviour is incorrect for zero initialized data, so
emit those to the .data section instead.

Differential Revision: https://reviews.llvm.org/D69528
2019-11-11 18:52:10 -05:00
Victor Huang edab7dd426 Disable hoisting MI to hotter basic blocks
In current Hoist() function of machine licm pass, it will not check the source and destination basic block frequencies that a instruction is hoisted from/to.
There is a chance that instruction is hoisted from a cold to a hot basic block.

In this patch, we add options to disable machine instruction hoisting if destination block is hotter.

Differential Revision: https://reviews.llvm.org/D63676
2019-11-11 21:32:56 +00:00
Evandro Menezes c19528f180 [AArch64] Update for Exynos
Fix the costs of FP register moves.
2019-11-11 15:02:51 -06:00
Evandro Menezes 2eb9233034 [AArch64] Add new scheduling predicates
Add new scheduling predicates to identify more ASIMD forms.
2019-11-11 15:02:51 -06:00
Thomas Raoux e0f1d9d872 [ModuloSchedule] Fix modulo expansion for data loop carried dependencies.
The new experimental expansion has a problem when a value has a data
dependency with an instruction from a previous stage. This is due to
the way we peel out the kernel. To fix that I'm changing the way we
peel out the kernel. We now peel the kernel NumberStage - 1 times.
The code would be correct at this point if we didn't have to handle
cases where the loop iteration is smaller than the number of stages.
To handle this case we move instructions between different epilogues
based on their stage and remap the PHI instructions correctly.

Differential Revision: https://reviews.llvm.org/D69538
2019-11-11 12:09:27 -08:00
Simon Pilgrim 0e0dea8268 Add missing override modifiers for FileCheckExpressionAST::eval() overrides. 2019-11-11 18:51:46 +00:00
Simon Pilgrim 0d908e1252 Make FileCheckNumericVariable::getDefLineNumber const. NFC
Fixes cppcheck warning.
2019-11-11 18:51:45 +00:00
Thomas Raoux 03da6e8c00 [ModuloSchedule] Do target loop analysis before peeling.
Simple change to call target hook analyzeLoopForPipelining before
    changing the loop. After peeling analyzing the loop may be more
    complicated for target that don't have a loop instruction. This doesn't
    affect Hexagone and PPC as they have hardware loop instructions.

    Differential Revision: https://reviews.llvm.org/D69912
2019-11-11 09:35:39 -08:00
Yi-Hong Lyu 6bbfafd037 [CGP] Make ICMP_EQ use CR result of ICMP_S(L|G)T dominators
For example:

long long test(long long a, long long b) {
  if (a << b > 0)
    return b;
  if (a << b < 0)
    return a;
  return a*b;
}

Produces:

        sld. 5, 3, 4
        ble 0, .LBB0_2
        mr 3, 4
        blr
.LBB0_2:                                # %if.end
        cmpldi  5, 0
        li 5, 1
        isel 4, 4, 5, 2
        mulld 3, 4, 3
        blr

But the compare (cmpldi 5, 0) is redundant and can be removed (CR0 already
contains the result of that comparison).

The root cause of this is that LLVM converts signed comparisons into equality
comparison based on dominance. Equality comparisons are unsigned by default, so
we get either a record-form or cmp (without the l for logical) feeding a cmpl.
That is the situation we want to avoid here.

Differential Revision: https://reviews.llvm.org/D60506
2019-11-11 17:28:50 +00:00
Simon Pilgrim 5cfce5079b Timer - fix shadow variable warnings for Name/Description members. NFC. 2019-11-11 17:19:14 +00:00
Francis Visoiu Mistrih a9a3781df8 [ObjC] Override TailCallKind when lowering objc intrinsics
The tail-call-kind-ness is known by the ObjCARC analysis and can be
enforced while lowering the intrinsics to calls.

This allows us to get the requested tail calls at -O0 without trying to
preserve the attributes throughout passes that change code even at -O0
,like the Always Inliner, where the ObjCOpt pass doesn't run.

Differential Revision: https://reviews.llvm.org/D69980
2019-11-11 08:30:06 -08:00
Stefan Pintile fdf3d1766b [PowerPC] Implementing overflow version for XO-Form instructions
The Overflow version of XO-Form instruction uses the SO, OV and
OV32 special registers.

This changes modifies existing multiclasses and instruction
definitions to allow for the use of the XER register to record
the various types if overflow from possible add, subtract and
multiply instructions. It then modifies the existing instructions
as to use these multiclasses as needed.

Patch By: Kamau Bridgeman

Differential Revision: https://reviews.llvm.org/D66902
2019-11-11 09:50:46 -06:00
Sanjay Patel 29f5d1670c Revert "[InstCombine] avoid crash from deleting an instruction that still has uses (PR43723) (3rd try)"
This reverts commit 3db8a3ef86.
This caused a different memory-sanitizer failure than earlier attempts,
but it's still not right.
2019-11-11 09:56:03 -05:00
Sanjay Patel 3db8a3ef86 [InstCombine] avoid crash from deleting an instruction that still has uses (PR43723) (3rd try)
Re-try because earlier attempts were reverted due to use-after-free.
Hopefully, diagnosed correctly this time - we replace/remove the
invariant.start first rather than the invariant.end to avoid angering
worklist-based iteration.

We gather a set of white-listed instructions in isAllocSiteRemovable() and then
replace/erase them. But we don't know in general if the instructions in the set
have uses amongst themselves, so order of deletion makes a difference.

There's already a special-case for the llvm.objectsize intrinsic, so add another
for llvm.invariant.start.

Should fix:
https://bugs.llvm.org/show_bug.cgi?id=43723

Differential Revision: https://reviews.llvm.org/D69977
2019-11-11 09:29:40 -05:00
Tom Weaver 9f48a160dd Revert "[DBG][OPT] Attempt to salvage or undef debug info when removing trivially deletable instructions in the Reassociate Expression pass."
This reverts commit 1984a27db5.
2019-11-11 14:13:33 +00:00
Tom Weaver 1984a27db5 [DBG][OPT] Attempt to salvage or undef debug info when removing trivially deletable instructions in the Reassociate Expression pass.
Reviewed By: aprantl, vsk

Differential revision: https://reviews.llvm.org/D69943
2019-11-11 13:47:13 +00:00
Tom Weaver 75af15d81e [NFC][TEST_COMMIT] Add fullstop to comment. 2019-11-11 13:38:34 +00:00
Simon Pilgrim 0cc7c29a97 AArch64FunctionInfo - fix uninitialized variable warnings. NFCI. 2019-11-11 11:24:09 +00:00
Simon Pilgrim b47c7cd4d6 Fix -Wcovered-switch-default warning. NFCI. 2019-11-11 11:18:44 +00:00
Simon Pilgrim 0040c4ba1e Fix -Wparentheses warning. NFCI. 2019-11-11 11:13:32 +00:00
Simon Pilgrim 8383be0f75 Remove superfluous ';' to fix Wpedantic. NFC. 2019-11-11 10:54:00 +00:00
Jay Foad 9323ef4ecc [InstCombine] Simplify binary op when only one operand is a select
Summary:
SimplifySelectsFeedingBinaryOp simplified binary ops when both operands
were selects with the same condition. This patch extends it to handle
these cases where only one operand is a select:

X op (C ? P : Q) -> C ? (X op P) : (X op Q)
  // if X op P and X op Q both simplify
(C ? P : Q) op Y -> C ? (P op Y) : (Q op Y)
  // if P op Y and Q op Y both simplify

For example: X *fast (C ? 1.0 : 0.0) -> C ? X : 0.0

Reviewers: mcberg2017, majnemer, craig.topper, qcolombet, mcrosier

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64713
2019-11-11 10:01:59 +00:00
joanlluch e0012c5d6a [TargetLowering][DAGCombine][MSP430] Shift Amount Threshold in DAGCombine (3)
Summary:
Additional filtering of undesired shifts for targets that do not support them efficiently.

Related with  D69116 and  D69120

Applies the TLI.getShiftAmountThreshold hook to prevent undesired generation of shifts for the following IR code:

```
define i16 @testShiftBits(i16 %a) {
entry:
  %and = and i16 %a, -64
  %cmp = icmp eq i16 %and, 64
  %conv = zext i1 %cmp to i16
  ret i16 %conv
}

define i16 @testShiftBits_11(i16 %a) {
entry:
  %cmp = icmp ugt i16 %a, 63
  %conv = zext i1 %cmp to i16
  ret i16 %conv
}

define i16 @testShiftBits_12(i16 %a) {
entry:
  %cmp = icmp ult i16 %a, 64
  %conv = zext i1 %cmp to i16
  ret i16 %conv
}
```
The attached diff file shows the piece code in TargetLowering that is responsible for the generation of shifts in relation to the IR above.

Before applying this patch, shifts will be generated to replace non-legal icmp immediates. However, shifts may be undesired if they are even more expensive for the target.

For all my previous patches in this series (cited above) I added test cases for the MSP430 target. However, in this case, the target is not suitable for showing improvements related with this patch, because the MSP430 does not implement "isLegalICmpImmediate". The default implementation returns always true, therefore the patched code in TargetLowering is never reached for that target. Targets implementing both "isLegalICmpImmediate" and "getShiftAmountThreshold" will benefit from this.

The differential effect of this patch can only be shown for the MSP430 by temporarily implementing "isLegalICmpImmediate" to return false for large immediates. This is simulated with the implementation of a command line flag that was incorporated in D69975

This patch belongs to a initiative to "relax" the generation of shifts by LLVM for targets requiring it

Reviewers: spatel, lebedev.ri, asl

Reviewed By: spatel

Subscribers: lenary, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69326
2019-11-11 10:18:25 +01:00
Georgii Rymar 6b15c5dfac [FixBB] - Fix one more std::min -> std::min<uint64_t> to make BB happy.
BB: http://lab.llvm.org:8011/builders/clang-armv7-linux-build-cache/builds/22133/steps/build%20stage%201/logs/stdio
2019-11-11 12:11:54 +03:00
Matt Arsenault e6c9a9af39 Use MCRegister in copyPhysReg 2019-11-11 14:42:33 +05:30
Georgii Rymar a26d7b6298 [FixBB] - An attemp to fix clang-armv7-linux-build-cache builder.
http://lab.llvm.org:8011/builders/clang-armv7-linux-build-cache/builds/22130/steps/build%20stage%201/logs/stdio

/usr/bin/c++   -DGTEST_HAS_RTTI=0 -D_DEBUG -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE -D_LARGEFILE_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Ilib/ObjectYAML -I/home/buildslave/buildslave/clang-armv7-linux-build-cache/llvm/llvm/lib/ObjectYAML -I/usr/include/libxml2 -Iinclude -I/home/buildslave/buildslave/clang-armv7-linux-build-cache/llvm/llvm/include -mthumb -fPIC -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wstring-conversion -fdiagnostics-color -ffunction-sections -fdata-sections -O3    -UNDEBUG  -fno-exceptions -fno-rtti -std=c++14 -MMD -MT lib/ObjectYAML/CMakeFiles/LLVMObjectYAML.dir/YAML.cpp.o -MF lib/ObjectYAML/CMakeFiles/LLVMObjectYAML.dir/YAML.cpp.o.d -o lib/ObjectYAML/CMakeFiles/LLVMObjectYAML.dir/YAML.cpp.o -c /home/buildslave/buildslave/clang-armv7-linux-build-cache/llvm/llvm/lib/ObjectYAML/YAML.cpp
/home/buildslave/buildslave/clang-armv7-linux-build-cache/llvm/llvm/lib/ObjectYAML/YAML.cpp:42:41: error: no matching function for call to 'min'
    OS.write((const char *)Data.data(), std::min(N, Data.size()));
                                        ^~~~~~~~
/usr/bin/../lib/gcc/arm-linux-gnueabihf/5.4.0/../../../../include/c++/5.4.0/bits/algorithmfwd.h:370:5: note: candidate template ignored: deduced conflicting types for parameter '_Tp' ('unsigned long long' vs. 'unsigned int')
    min(const _Tp&, const _Tp&);
    ^
/usr/bin/../lib/gcc/arm-linux-gnueabihf/5.4.0/../../../../include/c++/5.4.0/bits/stl_algo.h:3451:5: note: candidate template ignored: could not match 'initializer_list<type-parameter-0-0>' against 'unsigned long long'
    min(initializer_list<_Tp> __l, _Compare __comp)
    ^
/usr/bin/../lib/gcc/arm-linux-gnueabihf/5.4.0/../../../../include/c++/5.4.0/bits/algorithmfwd.h:375:5: note: candidate function template not viable: requires 3 arguments, but 2 were provided
    min(const _Tp&, const _Tp&, _Compare);
    ^
/usr/bin/../lib/gcc/arm-linux-gnueabihf/5.4.0/../../../../include/c++/5.4.0/bits/stl_algo.h:3445:5: note: candidate function template not viable: requires single argument '__l', but 2 arguments were provided
    min(initializer_list<_Tp> __l)
    ^
/home/buildslave/buildslave/clang-armv7-linux-build-cache/llvm/llvm/lib/ObjectYAML/YAML.cpp:46:28: error: no matching function for call to 'min'
  for (uint64_t I = 0, E = std::min(N, Data.size() / 2); I != E; ++I) {
                           ^~~~~~~~
/usr/bin/../lib/gcc/arm-linux-gnueabihf/5.4.0/../../../../include/c++/5.4.0/bits/algorithmfwd.h:370:5: note: candidate template ignored: deduced conflicting types for parameter '_Tp' ('unsigned long long' vs. 'unsigned int')
    min(const _Tp&, const _Tp&);
    ^
/usr/bin/../lib/gcc/arm-linux-gnueabihf/5.4.0/../../../../include/c++/5.4.0/bits/stl_algo.h:3451:5: note: candidate template ignored: could not match 'initializer_list<type-parameter-0-0>' against 'unsigned long long'
    min(initializer_list<_Tp> __l, _Compare __comp)
    ^
/usr/bin/../lib/gcc/arm-linux-gnueabihf/5.4.0/../../../../include/c++/5.4.0/bits/algorithmfwd.h:375:5: note: candidate function template not viable: requires 3 arguments, but 2 were provided
    min(const _Tp&, const _Tp&, _Compare);
    ^
/usr/bin/../lib/gcc/arm-linux-gnueabihf/5.4.0/../../../../include/c++/5.4.0/bits/stl_algo.h:3445:5: note: candidate function template not viable: requires single argument '__l', but 2 arguments were provided
    min(initializer_list<_Tp> __l)

Fix: specify the type for std::min call.
2019-11-11 12:02:29 +03:00
Sander de Smalen 84a0c8e3ae [AArch64][SVE] Spilling/filling of SVE callee-saves.
Implement the spills/fills of callee-saved SVE registers using STR and LDR
instructions.

Also adds the `aarch64_sve_vector_pcs` attribute to specify the
callee-saved registers to be used for functions that return SVE vectors or
take SVE vectors as arguments. The callee-saved registers are vector
registers z8-z23 and predicate registers p4-p15.

The overal frame-layout with SVE will be as follows:

   +-------------+
   | stack args  |
   +-------------+
   | Callee Saves|
   |   X29, X30  |
   |-------------| <- FP
   | SVE Callee  | < //////////////
   | saved regs  | < //////////////
   |    z23      | < //////////////
   |     :       | < // SCALABLE //
   |    z8       | < //////////////
   |    p15      | < /// STACK ////
   |     :       | < //////////////
   |    p4       | < //// AREA ////
   +-------------+ < //////////////
   |     :       | < //////////////
   |  SVE locals | < //////////////
   |     :       | < //////////////
   +-------------+
   |/////////////| alignment gap.
   |     :       |
   | Stack objs  |
   |     :       |
   +-------------+ <- SP after call and frame-setup

Reviewers: cameron.mcinally, efriedma, greened, thegameg, ostannard, rengolin

Reviewed By: ostannard

Differential Revision: https://reviews.llvm.org/D68996
2019-11-11 09:03:19 +00:00
Georgii Rymar 06456daa9e [yaml2obj] - Add a way to describe the custom data that is not part of an output section.
Currently there is no way to describe the data that is not a part of an output section.
It can be a data used to align sections or to fill the gaps with something,
or another kind of custom data. In this patch I suggest a way to describe it. It looks like that:

```
Sections:
  - Type:    CustomFiller
    Pattern: "CCDD"
    Size:    4
  - Name:    .bar
    Type:    SHT_PROGBITS
    Content: "FF"
```

I.e. I've added a kind of synthetic section with a synthetic type "CustomFiller".
In the code it is called a "SyntheticFiller", which is "a synthetic section which
might be used to write the custom data around regular output sections. It does
not present in the sections header table, but it might affect the output file size and
program headers produced. Think about it as about piece of data."

`SyntheticFiller` currently has a `Pattern` field and a `Size` field + an optional `Name`.
When written, `Size` of bytes in the output will be filled with a `Pattern`.
It is possible to reference a named filler it by name from the program headers description,
just like any other normal section.

Differential revision: https://reviews.llvm.org/D69709
2019-11-11 11:48:23 +03:00
Craig Topper aafde063aa [InstCombine] Turn (extractelement <1 x i64/double> (bitcast (x86_mmx))) into a single bitcast from x86_mmx to i64/double.
The _m64 type is represented in IR as <1 x i64>. The x86-64 ABI
on Linux passes <1 x i64> as a double. MMX intrinsics use x86_mmx
type in IR.These things result in a lot of bitcasts in mmx code.
There's another instcombine that tries to turn bitcast <1 x i64>
to double into extractelement and a bitcast.

The combine here tries to reverse this extractelement conversion
if we see an mmx type.
2019-11-10 16:25:25 -08:00
Sanjay Patel d115b9fd4a Revert "[InstCombine] avoid crash from deleting an instruction that still has uses (PR43723) (2nd try)"
This reverts commit 56b2aee187.
Still causes a use-after-free on sanitizer bots.
2019-11-10 18:47:49 -05:00
Sanjay Patel 56b2aee187 [InstCombine] avoid crash from deleting an instruction that still has uses (PR43723) (2nd try)
Re-try rGef02831f0a4e (reverted due to use-after-free), but bail out completely
if we encounter an unexpected llvm.invariant.start.

We gather a set of white-listed instructions in isAllocSiteRemovable() and then
replace/erase them. But we don't know in general if the instructions in the set
have uses amongst themselves, so order of deletion makes a difference.

There's already a special-case for the llvm.objectsize intrinsic, so add another
for llvm.invariant.end.

Should fix:
https://bugs.llvm.org/show_bug.cgi?id=43723

Differential Revision: https://reviews.llvm.org/D69977
2019-11-10 17:26:36 -05:00
Stefan Stipanovic c250ebf7bc getArgOperandNo helper function.
Summary: A helper function to get argument number of a arg operand Use.

Reviewers: jdoerfert, uenoku

Subscribers: hiraditya, lebedev.ri, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66844
2019-11-10 21:45:11 +01:00
Sanjay Patel b0ac26a632 Revert "[InstCombine] avoid crash from deleting an instruction that still has uses (PR43723)"
This reverts commit ef02831f0a.
Sanitizer bots fail with this change.
2019-11-10 11:18:05 -05:00
Luís Marques 1c737f54be [RISCV] Fix CFA when doing split sp adjustment with fp
Summary: When using the split sp adjustment and using the frame-pointer
we were still emitting CFI CFA directives based on the sp value. The
final sp-based offset also didn't reflect the two-stage sp adjust. There
remain CFI issues that aren't related to the split sp adjustment, and
thus will be addressed in a separate patch.

Reviewers: asb, lenary, shiva0217
Reviewed By: lenary, shiva0217
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69385
2019-11-10 16:09:14 +00:00
Luís Marques be0fead7bf [RISCV][NFC] Add CFI-related tests
Summary: Adds tests necessary to properly show the impact of other
patches that affect the emission of CFI directives.

Reviewers: asb, lenary
Reviewed By: lenary
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69721
2019-11-10 16:00:07 +00:00
Sanjay Patel ef02831f0a [InstCombine] avoid crash from deleting an instruction that still has uses (PR43723)
We gather a set of white-listed instructions in isAllocSiteRemovable() and then
replace/erase them. But we don't know in general if the instructions in the set
have uses amongst themselves, so order of deletion makes a difference.

There's already a special-case for the llvm.objectsize intrinsic, so add another
for llvm.invariant.end.

Should fix:
https://bugs.llvm.org/show_bug.cgi?id=43723

Differential Revision: https://reviews.llvm.org/D69977
2019-11-10 09:18:11 -05:00
Simon Pilgrim 4ff246fef2 Remove unused variable (which allows us to remove vector include). NFC. 2019-11-10 12:16:23 +00:00
Simon Pilgrim 616a7f6ca0 TableGen - fix uninitialized variable warnings. NFCI. 2019-11-10 11:19:50 +00:00
Fangrui Song d890620fb2 [MC] Clean up MacroInstantiation. NFC 2019-11-09 23:27:15 -08:00
Tsang Whitney W.H 89453d186d [NFC]: Fix PVS Studio warning in LoopNestAnalysis
Summary:This patch fixes the following warnings uncovered by PVS
Studio:

/home/xbolva00/LLVM/llvm-project/llvm/lib/Analysis/LoopCacheAnalysis.cpp
353 warn V612 An unconditional 'return' within a loop.
/home/xbolva00/LLVM/llvm-project/llvm/lib/Analysis/LoopCacheAnalysis.cpp
456 err V502 Perhaps the '?:' operator works in a different way than it
was expected. The '?:' operator has a lower priority than the '=='
operator.
Authored By:etiotto
Reviewer:Meinersbur, kbarton, bmahjour, Whitney, xbolva00
Reviewed By:xbolva00
Subscribers:hiraditya, llvm-commits
Tag:LLVM
Differential Revision:https://reviews.llvm.org/D69821
2019-11-10 05:39:40 +00:00
Craig Topper c2751737e5 [X86] Handle MO_ConstantPoolIndex in X86AsmPrinter::PrintOperand
Fixes PR43952
2019-11-09 18:01:26 -08:00
Simon Pilgrim b0d0928241 YAMLParser - fix SimpleKey uninitialized variable warnings. NFCI. 2019-11-09 22:11:50 +00:00
Simon Pilgrim 6976a0e826 RegisterCoalescer - remove duplicate variable to fix Wshadow warning. NFCI. 2019-11-09 20:10:12 +00:00
Simon Pilgrim f092e80939 RegisterCoalescer - fix uninitialized variables. NFCI. 2019-11-09 20:10:11 +00:00
Gil Rapaport 7f152543e4 [LV] Apply sink-after & interleave-groups as VPlan transformations (NFCI)
This recommits 11ed1c0239 (reverted in
9f08ce0d21 for failing an assert) with a fix:
tryToWidenMemory() now first checks if the widening decision is to interleave,
thus maintaining previous behavior where tryToInterleaveMemory() was called
first, giving priority to interleave decisions over widening/scalarization. This
commit adds the test case that exposed this bug as a LIT.
2019-11-09 20:52:25 +02:00
Simon Pilgrim 3c37981bb3 Fix shadow variable warning with llvm::SrcMgr. NFCI. 2019-11-09 17:03:21 +00:00
Simon Pilgrim 7f8488eeb4 Fix operator precedence warning. NFC. 2019-11-09 17:03:21 +00:00
Simon Pilgrim 2fb9d72c77 Fix builds where LLVM_ENABLE_STATS is disabled
Missed Stats->EnableStats rename in rG3fb832fe8bdc317687d5a4d2ca20f5f73b089341
2019-11-09 13:47:53 +00:00
Simon Pilgrim 56a725ae5e Remarks - fix static analyzer warnings. NFCI.
- Fix uninitialized variable warnings.
 - Reuse BitstreamEntry iterator to avoid Wshadow warning.
 - Match declaration + definition arg names in BitstreamRemarkParser::processCommonMeta
 - Make BitstreamRemarkParser(StringRef) constructor explicit
2019-11-09 13:01:05 +00:00
Simon Pilgrim dda8015434 Remove duplicate MemVT to fix shadow variable warning. NFCI. 2019-11-09 13:01:04 +00:00
Simon Pilgrim 3fb832fe8b Statistic - Fix shadow variable warning. NFCI.
Rename option 'Stats' to 'EnableStats' and prevent clash with StatisticInfo::Stats member
2019-11-09 13:01:04 +00:00
Simon Pilgrim a35a44fd4b Remove superfluous break after return. NFC. 2019-11-09 13:01:03 +00:00
Simon Pilgrim 59a14f9d4b Fix shadow variable warning by reducing scope of CC/InverseCC CondCodes. NFCI. 2019-11-09 13:01:03 +00:00
Simon Pilgrim 0d5ad57ae3 Remarks - fix shadow variable warnings. NFCI.
Avoid conflict with llvm::remarks::Magic global variable.
2019-11-09 13:01:03 +00:00
Jay Foad d162e02cee Refactor SimplifySelectsFeedingBinaryOp for D64713. NFC. 2019-11-09 09:28:22 +00:00
Teresa Johnson b11391bb47 ThinLTO : Import always_inline functions irrespective of the threshold
Summary: A user can force a function to be inlined by specifying the always_inline attribute. Currently, thinlto implementation is not aware of always_inline functions and does not guarantee import of such functions, which in turn can prevent inlining of such functions.

Patch by Bharathi Seshadri <bseshadr@cisco.com>

Reviewers: tejohnson

Reviewed By: tejohnson

Subscribers: mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70014
2019-11-08 17:02:01 -08:00
David Blaikie db797bfb2b DebugInfo: Remove redundant conditionals/checks from macro info emission
These checks fall out naturally from the current implementation without
needing to be explicitly considered anymore.
2019-11-08 15:31:15 -08:00
David Blaikie 736273c7fe DebugInfo: Do not create a debug_macinfo section if no CUs have associated macros
Patch based on Sourabh Singh's D69839 patch.
2019-11-08 15:30:11 -08:00
David Blaikie 3951245c38 NVPTX: Don't insert an extra empty line at the end of the last section.
This was arbitrarily appearing in only the last section emitted - which
made tests more sensitive than they needed to be (removing the last
section - like the macinfo section change that's coming after this)
would, surprisingly, move the blank line to the previous section.
2019-11-08 15:16:04 -08:00
Fangrui Song 8f089f2099 [MC] Emit unused undefined symbol even if its binding is not set
Recommit r373168, which was reverted by r373242. This actually exposed a
boringssl bug which has been fixed for more than one month.

For the following two cases, we currently suppress the symbols. This
patch emits them (compatible with GNU as).

* `test2_a = undef`: if `undef` is otherwise unused.
* `.hidden hidden`: if `hidden` is unused. This is the main point of the
  patch, because omitting the symbol would cause a linker semantic
  difference.

It causes a behavior change that is not compatible with GNU as:

.weakref foo1, bar1

When neither foo1 nor bar1 is used, we now emit bar1, which is arguably
more consistent.

Another change is that we will emit .TOC. for .TOC.@tocbase .  For this
directive, suppressing .TOC. can be seen as a size optimization, but we
choose to drop it for simplicity and consistency.
2019-11-08 14:47:48 -08:00
joanlluch fe0763d28a [TargetLowering][DAGCombine][MSP430] Shift Amount Threshold in DAGCombine (3) (baseline tests)
Summary:
This is baseline tests for D69326

Incorporates a command line flag for the MSP430 and adds a test cases to help showing the effects of applying D69326

More details and motivation for this patch in D69326

Reviewers: spatel, asl, lebedev.ri

Reviewed By: spatel, asl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69975
2019-11-08 23:16:44 +01:00
David Blaikie 39c308f6b8 DebugInfo: Use separate macinfo contributions for each CU
The macinfo support was broken for LTO situations, by terminating
macinfo lists only once - multiple macinfo contributions were correctly
labeled, but they all continued/flowed into later contributions until
only one terminator appeared at the end of the section.

Correctly terminate each contribution & fix the parsing to handle this
situation too. The parsing fix is also necessary for dumping linked
binaries - the previous code would stop at the end of the first
contribution - missing all later contributions in a linked binary.

It'd be nice to improve the dumping to print the offsets of each
contribution so it'd be easier to know which CU AT_macro_info refers to
which macinfo contribution.
2019-11-08 13:27:00 -08:00
bmahjour f0af11d86f [DDG] Data Dependence Graph - Pi Block
Summary:
    This patch adds Pi Blocks to the DDG. A pi-block represents a group of DDG
    nodes that are part of a strongly-connected component of the graph.
    Replacing all the SCCs with pi-blocks results in an acyclic representation
    of the DDG. For example if we have:
       {a -> b}, {b -> c, d}, {c -> a}
    the cycle a -> b -> c -> a is abstracted into a pi-block "p" as follows:
       {p -> d} with "p" containing: {a -> b}, {b -> c}, {c -> a}
    In this implementation the edges between nodes that are part of the pi-block
    are preserved. The crossing edges (edges where one end of the edge is in the
    set of nodes belonging to an SCC and the other end is outside that set) are
    replaced with corresponding edges to/from the pi-block node instead.

    Authored By: bmahjour

    Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert

    Reviewed By: Meinersbur

    Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto, ppc-slack

    Tag: #llvm

    Differential Revision: https://reviews.llvm.org/D68827
2019-11-08 15:46:08 -05:00
Eli Friedman 5df3a87224 [AArch64][X86] Don't assume __powidf2 is available on Windows.
We had some code for this for 32-bit ARM, but this doesn't really need
to be in target-specific code; generalize it.

(I think this started showing up recently because we added an
optimization that converts pow to powi.)

Differential Revision: https://reviews.llvm.org/D69013
2019-11-08 12:43:21 -08:00
Gil Rapaport 9f08ce0d21 Revert "[LV] Apply sink-after & interleave-groups as VPlan transformations (NFCI)"
This reverts commit 11ed1c0239 - causes an assert failure.
2019-11-08 22:17:11 +02:00
Nikita Popov 885a05f48a Reapply [LVI] Normalize pointer behavior
Fix cache invalidation by not guarding the dereferenced pointer cache
erasure by SeenBlocks. SeenBlocks is only populated when actually
caching a value in the block, which doesn't necessarily have to happen
just because dereferenced pointers were calculated.

-----

Related to D69686. As noted there, LVI currently behaves differently
for integer and pointer values: For integers, the block value is always
valid inside the basic block, while for pointers it is only valid at
the end of the basic block. I believe the integer behavior is the
correct one, and CVP relies on it via its getConstantRange() uses.

The reason for the special pointer behavior is that LVI checks whether
a pointer is dereferenced in a given basic block and marks it as
non-null in that case. Of course, this information is valid only after
the dereferencing instruction, or in conservative approximation,
at the end of the block.

This patch changes the treatment of dereferencability: Instead of
including it inside the block value, we instead treat it as something
similar to an assume (it essentially is a non-nullness assume) and
incorporate this information in intersectAssumeOrGuardBlockValueConstantRange()
if the context instruction is the terminator of the basic block.
This happens either when determining an edge-value internally in LVI,
or when a terminator was explicitly passed to getValueAt(). The latter
case makes this change not fully NFC, because we can now fold
terminator icmps based on the dereferencability information in the
same block. This is the reason why I changed one JumpThreading test
(it would optimize the condition away without the change).

Of course, we do not want to recompute dereferencability on each
intersectAssume call, so we need a new cache for this. The
dereferencability analysis requires walking the entire basic block
and computing underlying objects of all memory operands. This was
previously done separately for each queried pointer value. In the
new implementation (both because this makes the caching simpler,
and because it is faster), I instead only walk the full BB once and
cache all the dereferenced pointers. So the traversal is now performed
only once per BB, instead of once per queried pointer value.

I think the overall model now makes more sense than before, and there
will be no more pitfalls due to differing integer/pointer behavior.

Differential Revision: https://reviews.llvm.org/D69914
2019-11-08 20:13:55 +01:00
Jan Korous 590f279c45 [clang] Add VFS support for sanitizers' blacklists
Differential Revision: https://reviews.llvm.org/D69648
2019-11-08 10:58:50 -08:00
evgeny 7f92d66f37 [ThinLTO] Fix bug when importing writeonly variables
Patch enables import of write-only variables with non-trivial initializers
to fix linker errors. Initializers of imported variables are converted to
'zeroinitializer' to avoid promotion of referenced objects.

Differential revision: https://reviews.llvm.org/D70006
2019-11-08 20:50:34 +03:00
Kazu Hirata 9aff5e1c18 [JumpThreading] Fix a comment typo (NFC)
Reviewers: kazu

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70013
2019-11-08 09:29:46 -08:00
Nikita Popov 43ae5f4386 Revert "[LVI] Normalize pointer behavior"
This reverts commit 15bc4dc9a8.

clang-cmake-x86_64-sde-avx512-linux buildbot reported quite a few
compile-time regressions in test-suite, will investigate.
2019-11-08 18:22:34 +01:00
Nikita Popov 15bc4dc9a8 [LVI] Normalize pointer behavior
Related to D69686. As noted there, LVI currently behaves differently
for integer and pointer values: For integers, the block value is always
valid inside the basic block, while for pointers it is only valid at
the end of the basic block. I believe the integer behavior is the
correct one, and CVP relies on it via its getConstantRange() uses.

The reason for the special pointer behavior is that LVI checks whether
a pointer is dereferenced in a given basic block and marks it as
non-null in that case. Of course, this information is valid only after
the dereferencing instruction, or in conservative approximation,
at the end of the block.

This patch changes the treatment of dereferencability: Instead of
including it inside the block value, we instead treat it as something
similar to an assume (it essentially is a non-nullness assume) and
incorporate this information in intersectAssumeOrGuardBlockValueConstantRange()
if the context instruction is the terminator of the basic block.
This happens either when determining an edge-value internally in LVI,
or when a terminator was explicitly passed to getValueAt(). The latter
case makes this change not fully NFC, because we can now fold
terminator icmps based on the dereferencability information in the
same block. This is the reason why I changed one JumpThreading test
(it would optimize the condition away without the change).

Of course, we do not want to recompute dereferencability on each
intersectAssume call, so we need a new cache for this. The
dereferencability analysis requires walking the entire basic block
and computing underlying objects of all memory operands. This was
previously done separately for each queried pointer value. In the
new implementation (both because this makes the caching simpler,
and because it is faster), I instead only walk the full BB once and
cache all the dereferenced pointers. So the traversal is now performed
only once per BB, instead of once per queried pointer value.

I think the overall model now makes more sense than before, and there
will be no more pitfalls due to differing integer/pointer behavior.

Differential Revision: https://reviews.llvm.org/D69914
2019-11-08 17:57:14 +01:00