Commit Graph

41887 Commits

Evgenii Stepanov f2c0423995 [msan] Remove readnone and friends from call sites.
MSan removes readnone/readonly and similar attributes from callees,
because after MSan instrumentation those attributes no longer apply.

This change removes the attributes from call sites, as well.

Failing to do this may cause DSE of paramTLS stores before calls to
readonly/readnone functions.
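
A minimal sketch of the idea (hedged: this is not the actual MSan code, the
helper name is made up for illustration, and only readnone/readonly are shown):

```
#include "llvm/IR/Function.h"
#include "llvm/IR/InstrTypes.h"
#include "llvm/Support/Casting.h"
using namespace llvm;

// Strip memory attributes from the callee and from every call site that
// targets it, so the paramTLS stores emitted by the instrumentation right
// before the call are not dead-store-eliminated.
static void stripMemoryAttrs(Function &F) {
  for (Attribute::AttrKind K : {Attribute::ReadNone, Attribute::ReadOnly}) {
    F.removeFnAttr(K);
    for (User *U : F.users())
      if (auto *CB = dyn_cast<CallBase>(U))
        if (CB->getCalledOperand() == &F)
          CB->removeFnAttr(K);
  }
}
```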

Differential Revision: https://reviews.llvm.org/D85259
2020-08-05 10:34:45 -07:00
Jordan Rupprecht 3c39db0c44 Revert "[LoopVectorizer] Inloop vector reductions"
This reverts commit e9761688e4. It breaks the build:

```
~/src/llvm-project/llvm/lib/Analysis/IVDescriptors.cpp:868:10: error: no viable conversion from returned value of type 'SmallVector<[...], 8>' to function return type 'SmallVector<[...], 4>'
  return ReductionOperations;
```
2020-08-05 10:24:15 -07:00
Mircea Trofin b18c41c66f [TFUtils] Expose untyped accessor to evaluation result tensors
These were implementation details, but they become necessary for generic
data copying.

Also added const variants and a move-assignment operator, since we had a
move ctor (and the move assignment helps in a subsequent patch).

Differential Revision: https://reviews.llvm.org/D85262
2020-08-05 10:22:45 -07:00
David Green e9761688e4 [LoopVectorizer] Inloop vector reductions
Arm MVE has multiple instructions such as VMLAVA.s8, which (in this
case) can take two 128-bit vectors, sign-extend the inputs to i32,
multiply them together, and sum the result into a 32-bit general
purpose register. So taking 16 i8s as inputs, they can multiply and
accumulate the result into a single i32 without any rounding/truncating
along the way. There are also reduction instructions for plain integer
add and min/max, and operations that sum into a pair of 32-bit registers
treated together as a 64-bit integer (even though MVE does not have a
plain 64-bit addition instruction). So giving the vectorizer the ability
to use these instructions both enables us to vectorize at higher
bitwidths and to vectorize things we previously could not.
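
For illustration, this is the kind of source loop being described (a hedged
example, not taken from the patch or its tests): an i8 multiply-accumulate
reduction into an i32, which maps onto VMLAVA.s8 when the reduction is kept
in-loop.

```
int dot_product(const signed char *a, const signed char *b, int n) {
  int sum = 0;
  for (int i = 0; i < n; ++i)
    sum += a[i] * b[i]; // i8 inputs sign-extended to i32, then accumulated
  return sum;
}
```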

In order to do that we need a way to represent that the reduction
operation, specified with an llvm.experimental.vector.reduce intrinsic
when vectorizing for Arm, occurs inside the loop rather than after it
like most reductions. This patch attempts to do that, teaching the
vectorizer about in-loop reductions. It does this through a VPlan recipe
that represents the reduction and replaces the original chain of
reduction operations. Cost modelling is currently just done through a
prefersInloopReduction TTI hook (which follows in a later patch).

Differential Revision: https://reviews.llvm.org/D75069
2020-08-05 18:14:05 +01:00
Georgii Rymar 6ae5b9e405 [llvm-readobj] - Make decode_relrs() not return Expected<>. NFCI.
The `decode_relrs` helper is declared as:

`Expected<std::vector<Elf_Rel>> decode_relrs(Elf_Relr_Range relrs) const;`

it never returns an error though and hence can be simplified to return
a vector.

Differential revision: https://reviews.llvm.org/D85302
2020-08-05 17:05:47 +03:00
Simon Pilgrim 7b993903e0 DWARFVerifier.h - remove unnecessary forward declarations and includes. NFCI. 2020-08-05 12:42:44 +01:00
Simon Pilgrim 315e1daf7f GISelWorkList.h - remove unnecessary includes. NFCI. 2020-08-05 12:00:28 +01:00
Simon Pilgrim e3d3657b9b CallLowering.h - remove unnecessary CCState forward declaration. NFCI.
Already defined in CallingConvLower.h
2020-08-05 12:00:28 +01:00
Hans Wennborg 3ab01550b6 Revert "[CMake] Simplify CMake handling for zlib"
This quietly disabled use of zlib on Windows even when building with
-DLLVM_ENABLE_ZLIB=FORCE_ON.

> Rather than handling zlib manually, use find_package from CMake
> to find zlib properly. Use this to normalize the LLVM_ENABLE_ZLIB,
> HAVE_ZLIB, HAVE_ZLIB_H. Furthermore, require zlib if LLVM_ENABLE_ZLIB is
> set to YES, which requires the distributor to explicitly select whether
> zlib is enabled or not. This simplifies the CMake handling and usage in
> the rest of the tooling.
>
> This is a reland of abb0075 with all followup changes and fixes that
> should address issues that were reported in PR44780.
>
> Differential Revision: https://reviews.llvm.org/D79219

This reverts commit 10b1b4a231 and follow-ups
64d99cc6ab and
f9fec0447e.
2020-08-05 12:31:44 +02:00
Georgii Rymar f97019ad6e [llvm-readobj/elf] - Add testing for --stackmap and refine the implementation.
Currently, we only test the `--stackmap` option here:
https://github.com/llvm/llvm-project/blob/master/llvm/test/Object/stackmap-dump.test
It uses a precompiled MachO binary, and I've found no tests for this option for ELF.

The implementation also has issues. For example, it might assert on a wrong version
of the .llvm-stackmaps section. Or it might crash on an empty or truncated section.

This patch introduces a new tools/llvm-readobj/ELF test file and implements a few
basic checks to catch simple crashes/issues.

It also eliminates `unwrapOrError` calls in `printStackMap()`.

Differential revision: https://reviews.llvm.org/D85208
2020-08-05 13:09:04 +03:00
Yevgeny Rouban bc10888dcd DomTree: Make PostDomTree indifferent to block successors swap
This fixes commit c35585e209.

This is a fix for bug 46098, where the PostDominatorTree
is unexpectedly changed by InstCombine's branch-swapping
transformation.
This patch fixes the PostDomTree builder. While looking for
the furthest-away node in a reverse-unreachable subgraph,
the builder now runs DFS visiting successors in their order
within the function. This order is unaffected by successor
swaps, and therefore so is the chosen furthest-away node.

Reviewers: kuhar, nikic, lebedev.ri
Differential Revision: https://reviews.llvm.org/D84763
2020-08-05 14:26:32 +07:00
Mehdi Amini 1366d66a22 Revert "DomTree: Make PostDomTree immune to block successors swap"
This reverts commit c35585e209.

MLIR is broken by this patch; reproduce by adding
-DLLVM_ENABLE_PROJECTS=mlir to the CMake configuration and
building `ninja tools/mlir/lib/IR/CMakeFiles/obj.MLIRIR.dir/Dominance.cpp.o`
2020-08-05 04:32:44 +00:00
Evgeniy Brevnov 02a629daad [BPI][NFC] Unify handling of normal and SCC based loops
This is one more NFC part extracted from D79485. Normal and SCC-based loops have very different representations and have to be handled separately each time we deal with loops. D79485 is going to introduce much more extensive use of loops, which would be problematic without this change.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D84838
2020-08-05 11:19:24 +07:00
Yevgeny Rouban c35585e209 DomTree: Make PostDomTree immune to block successors swap
This is another fix for bug 46098, where the PostDominatorTree
is unexpectedly changed by InstCombine's branch-swapping
transformation.
This patch fixes the PostDomTree builder. While looking for
the furthest-away node in a reverse-unreachable subgraph,
the builder now runs DFS visiting successors in their order
within the function. This order is unaffected by successor
swaps, and therefore so is the chosen furthest-away node.

Reviewers: kuhar, nikic, lebedev.ri
Differential Revision: https://reviews.llvm.org/D84763
2020-08-05 11:06:54 +07:00
Matt Arsenault 54615ec48f GlobalISel: Move load/store lowering to separate functions 2020-08-04 22:03:51 -04:00
Mircea Trofin 90b9c49ca6 [llvm] Expose type and element count-related APIs on TensorSpec
Added a mechanism to check the element type and to get the total element
count and the size of an element.

Differential Revision: https://reviews.llvm.org/D85250
2020-08-04 17:32:16 -07:00
Krzysztof Parzyszek 06d425737b [RDF] Add operator<<(raw_ostream&, RegisterAggr), NFC 2020-08-04 18:40:07 -05:00
Krzysztof Parzyszek 9521704553 [RDF] Use hash-based containers, cache extra information
This improves performance.
2020-08-04 18:36:49 -05:00
Yonghong Song 00602ee7ef BPF: simplify IR generation for __builtin_btf_type_id()
This patch simplifies IR generation for __builtin_btf_type_id().
For __builtin_btf_type_id(obj, flag), the IR builtin previously
looked like
   if (obj is an lvalue)
     llvm.bpf.btf.type.id(obj.ptr, 1, flag)  !type
   else
     llvm.bpf.btf.type.id(obj, 0, flag)  !type
The purpose of the 2nd argument is to differentiate
   __builtin_btf_type_id(obj, flag) where obj is an lvalue
vs.
   __builtin_btf_type_id(obj.ptr, flag)

Note that obj or obj.ptr is never used by the backend
and the `obj` argument is only used to derive the type.
This code sequence is subject to potential llvm CSE when
  - obj is the same, e.g., nullptr
  - flag is the same
  - metadata type is different, e.g., typedef of struct "s"
    vs. struct "s"
In the above, we don't want CSE since their metadata is different.
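
As a hedged illustration of the problem (example code assumed, not from the
patch; the flag value 0 is only illustrative): the two calls below pass
identical operands and differ only in the type behind the !type metadata,
which CSE does not inspect.

```
struct s { int a; };
typedef struct s s_t;

unsigned long long id1, id2;

void probe(void) {
  // Same (never-evaluated) object expression and same flag; only the
  // attached type metadata differs.
  id1 = __builtin_btf_type_id(*(struct s *)0, 0);
  id2 = __builtin_btf_type_id(*(s_t *)0, 0);
}
```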

This patch changes the IR builtin to
   llvm.bpf.btf.type.id(seq_num, flag)  !type
where seq_num is always increasing. This will prevent potential
llvm CSE.

Also report an error if the type name is empty for remote relocation,
since remote relocation needs a non-empty type name to do relocation
against vmlinux.

Differential Revision: https://reviews.llvm.org/D85174
2020-08-04 16:29:42 -07:00
Krzysztof Parzyszek f0f467aeec [RDF] Cache register aliases in PhysicalRegisterInfo
This improves performance of PhysicalRegisterInfo::makeRegRef.
2020-08-04 18:10:00 -05:00
Arthur Eubanks f50b3ff02e [Hexagon] Use InstSimplify instead of ConstantProp
This is the last remaining use of ConstantProp; migrate it to InstSimplify with the goal of removing ConstantProp.

Add -hexagon-instsimplify option to enable skipping of instsimplify in
tests that can't handle the extra optimization.

Differential Revision: https://reviews.llvm.org/D85047
2020-08-04 15:42:39 -07:00
Krzysztof Parzyszek 09897b146a [RDF] Remove uses of RDFRegisters::normalize (deprecate)
This function has been reduced to an identity function for some time.
2020-08-04 17:02:12 -05:00
Matt Arsenault f8fb7835d6 GlobalISel: Add utility for getting function argument live ins
Get the argument register and ensure there's a copy to the virtual
register. AMDGPU and AArch64 have similarish code to get the livein
value, and I also want to use this in multiple places.

This is a bit more aggressive about setting the register class than
the original function, but that's probably OK.

I think we're missing a few verifier checks for function live ins. I
noticed AArch64's calling convention code is not actually adding
liveins to functions, only the entry block (which apparently might not
matter that much?). There should probably be a verifier check that
entry block live ins are also live into the function. We also might
need a verifier check that the copy to the livein virtual register is
in the entry block.
2020-08-04 16:55:55 -04:00
Matt Arsenault f2942f9c26 GlobalISel: Add node mappings for frameindex/blockaddress 2020-08-04 15:13:49 -04:00
Nikita Popov 4564974504 [SCCP] Propagate inequalities
Teach SCCP to create notconstant lattice values from inequality
assumes and nonnull metadata, and update getConstant() to make
use of them. Additionally isOverdefined() needs to be changed to
consider notconstant an overdefined value.
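
A hedged source-level example of an inequality assume (illustrative only, not
from the patch): after the assume, SCCP can record that x is notconstant(0)
and use that to fold the comparison.

```
void use(int);

void f(int x) {
  __builtin_assume(x != 0); // inequality assume: x is known not to be 0
  if (x == 0)               // provably false once x is notconstant(0)
    use(42);
}
```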

Handling inequality branches is delayed until our branch on undef
story in other passes has been improved.

Differential Revision: https://reviews.llvm.org/D83643
2020-08-04 20:20:52 +02:00
Xing GUO 12605bfd1f [DWARFYAML] Fix uninitialized value Is64BitAddrSize. NFC.
This patch fixes the undefined behavior that was reported by ubsan.

http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/44524/
2020-08-05 00:28:17 +08:00
Cameron McInally 23adbac9ee [GlobalISel] Don't transform FSUB(-0, X) -> FNEG(X) in GlobalISel.
This patch stops unconditionally transforming FSUB(-0, X) into an FNEG(X) while building the MIR.

This corresponds with the SelectionDAGISel change in D84056.

Differential Revision: https://reviews.llvm.org/D85139
2020-08-04 11:27:09 -05:00
Fangrui Song 593e196297 [llvm-symbolizer] Switch command line parsing from llvm::cl to OptTable
for the advantage outlined by D83639 ([OptTable] Support grouped short options)

Some behavior changes:

* -i={0,false} is removed. Use --no-inlines instead.
* --demangle={0,false} is removed. Use --no-demangle instead.
* -untag-addresses={0,false} is removed. Use --no-untag-addresses instead.

Added a higher level API OptTable::parseArgs which handles optional
initial options populated from an environment variable, expands response
files recursively, and parses options.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D83530
2020-08-04 08:53:15 -07:00
Yonghong Song 6d67506964 [clang][BPF] support type exist/size and enum exist/value relocations
This patch added the following additional compile-once
run-everywhere (CO-RE) relocations:
  - existence/size of typedef, struct/union or enum type
  - enum value and enum value existence

These additional relocations will make CO-RE bpf programs more
adaptive for potential kernel internal data structure changes.

For existence/size relocations, the following two code patterns
are supported:
  1. uint32_t __builtin_preserve_type_info(*(<type> *)0, flag);
  2. <type> var;
     uint32_t __builtin_preserve_field_info(var, flag);
flag = 0 for existence relocation and flag = 1 for size relocation.

For enum value existence and enum value relocations, the following code
pattern is supported:
  uint64_t __builtin_preserve_enum_value(*(<enum_type> *)<enum_value>,
                                         flag);
flag = 0 means existence relocation and flag = 1 means enum value
relocation. In the above, <enum_type> can be an enum type or
a typedef to enum type. The <enum_value> needs to be an enumerator
value from the same enum type. The return type is uint64_t to
permit potential 64-bit enumerator values.
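
A hedged usage example following the pattern above (the enum and variable
names are made up; the flag meanings are as described in the message):

```
enum VERSION { VERSION_1 = 1, VERSION_2 = 2 };

unsigned long long test(void) {
  // flag = 0: does enumerator VERSION_2 exist in the target kernel?
  unsigned long long exists =
      __builtin_preserve_enum_value(*(enum VERSION *)VERSION_2, 0);
  // flag = 1: what is its value in the target kernel?
  unsigned long long value =
      __builtin_preserve_enum_value(*(enum VERSION *)VERSION_2, 1);
  return exists ? value : 0;
}
```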

Differential Revision: https://reviews.llvm.org/D83242
2020-08-04 08:39:53 -07:00
Sander de Smalen fd6584a220 [AArch64][SVE] Fix CFA calculation in presence of SVE objects.
The CFA is calculated as (SP/FP + offset), but when there are
SVE objects on the stack the SP offset is partly scalable and
should instead be expressed as the DWARF expression:

     SP + offset + scalable_offset * VG

where VG is the Vector Granule register, containing the
number of 64-bit 'granules' in a scalable vector.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D84043
2020-08-04 11:47:06 +01:00
Xing GUO 79b44a4d47 [YAMLTraits] Fix mapping a <none> value that is followed by comments.
When mapping an optional value, if the value is <none> and is followed
by comments, there will be a parsing error. This patch fixes the
issue.

e.g.,

When mapping the following YAML,

```
Sections:
  - Name:  blah
    Type:  SHT_foo
    Flags: [[FLAGS=<none>]] ## some comments.
```

the raw value of `ScalarNode` is "<none> " rather than "<none>". We need
to remove the spaces.
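
A minimal sketch of that cleanup (assumed shape; the actual change lives in
YAMLTraits and may differ): strip trailing whitespace from the raw scalar
before comparing it against the <none> sentinel.

```
#include "llvm/ADT/StringRef.h"

static bool isNoneValue(llvm::StringRef Raw) {
  // "<none> " (with a trailing space before a comment) also matches.
  return Raw.rtrim() == "<none>";
}
```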

Differential Revision: https://reviews.llvm.org/D85180
2020-08-04 18:36:05 +08:00
Xing GUO 25abd1994e [YAMLParser] Fix a typo: iff -> if. NFC. 2020-08-04 12:42:42 +08:00
hgreving 509f5c4ec2 [MC] Fix memory leak when allocating MCInst with bump allocator
Adds the function createMCInst() to MCContext that creates an MCInst using
a typed bump allocator.

MCInst contains a SmallVector<MCOperand, 8>. The SmallVector is POD only
for <= 8 operands. The default untyped bump pointer allocator of MCContext
does not delete the MCInst, so if the SmallVector grows, it's a leak.
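
A rough sketch of the typed-allocator approach (assumed shape; the real
helper is MCContext::createMCInst()): a SpecificBumpPtrAllocator runs
destructors when it is destroyed, so an MCInst whose operand SmallVector
spilled to the heap is cleaned up rather than leaked.

```
#include "llvm/MC/MCInst.h"
#include "llvm/Support/Allocator.h"
#include <new>

llvm::SpecificBumpPtrAllocator<llvm::MCInst> InstAllocator;

llvm::MCInst *createInst() {
  // Placement-new into typed storage; the allocator destroys every MCInst
  // (and any heap-grown operand vector) when it goes away.
  return new (InstAllocator.Allocate()) llvm::MCInst();
}
```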

This fixes https://bugs.llvm.org/show_bug.cgi?id=46900.
2020-08-03 16:08:26 -07:00
Alina Sbirlea 1ce82015f6 [MemorySSA] Restrict optimizations after a PhiTranslation.
Merging alias results from different paths, when a path did phi
translation, is not necessarily correct. Conservatively terminate such paths.
Aimed to fix PR46156.

Differential Revision: https://reviews.llvm.org/D84905
2020-08-03 14:46:41 -07:00
Thomas Lively cb32792210 [WebAssembly] Implement prototype v128.load{32,64}_zero instructions
Specified in https://github.com/WebAssembly/simd/pull/237, these
instructions load the first vector lane from memory and zero the other
lanes. Since these instructions are not officially part of the SIMD
proposal, they are only available on an opt-in basis via LLVM
intrinsics and clang builtin functions. If these instructions are
merged to the proposal, this implementation will change so that the
instructions will be generated from normal IR. At that point the
intrinsics and builtin functions would be removed.

This PR also changes the opcodes for the experimental f32x4.qfm{a,s}
instructions because their opcodes conflicted with those of the
v128.load{32,64}_zero instructions. The new opcodes were chosen to
match those used in V8.

Differential Revision: https://reviews.llvm.org/D84820
2020-08-03 13:54:00 -07:00
Jian Cai c6334db577 [X86] support .nops directive
Add support for the .nops directive on X86. This addresses llvm.org/PR45788.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D82826
2020-08-03 11:50:56 -07:00
Hiroshi Yamauchi f78f509c75 [PGO] Extend the value profile buckets for mem op sizes.
Extend the memop value profile buckets to be more flexible (could accommodate a
mix of individual values and ranges) and to cover more value ranges (from 11 to
22 buckets).

This is disabled behind a flag (to be enabled separately); the existing
code will be removed later.

Differential Revision: https://reviews.llvm.org/D81682
2020-08-03 11:04:32 -07:00
Arthur Eubanks 7c19c89dd5 [NewPM][LoopVersioning] Port LoopVersioning to NPM
Reviewed By: ychen, fhahn

Differential Revision: https://reviews.llvm.org/D85063
2020-08-03 10:32:09 -07:00
Kevin P. Neal d535a91d13 [FPEnv] IRBuilder fails to add strictfp attribute
The strictfp attribute is required on all function calls in a function
that is itself marked with the strictfp attribute. The IRBuilder knows
this and has a method for adding the attribute to function call instructions.

If a function being called has the strictfp attribute itself then the
IRBuilder will refuse to add the attribute to the calling instruction
despite being asked to add it. Eliminate this error.

Differential Revision: https://reviews.llvm.org/D84878
2020-08-03 13:25:24 -04:00
Mircea Trofin 4b1b109c51 [llvm] Add a parser from JSON to TensorSpec
A JSON->TensorSpec utility we will use subsequently to specify
additional outputs needed for certain training scenarios.

Differential Revision: https://reviews.llvm.org/D84976
2020-08-03 09:49:31 -07:00
Xing GUO 08649d4321 [DWARFYAML] Implement the .debug_loclists section.
This patch implements the .debug_loclists section. Only two DWARF
expressions are implemented in this patch (DW_OP_consts,
DW_OP_stack_value); we will implement more in the future.

The YAML description of the .debug_loclists section is:

```
debug_loclists:
  - Format:              DWARF32 ## Optional
    Length:              0x1234  ## Optional
    Version:             5       ## Optional (5 by default)
    AddressSize:         8       ## Optional
    SegmentSelectorSize: 0       ## Optional (0 by default)
    OffsetEntryCount:    1       ## Optional
    Offsets:             [ 1 ]   ## Optional
    Lists:
      - Entries:
          - Operator:          DW_LLE_startx_endx
            Values:            [ 0x1234, 0x4321 ]
            DescriptorsLength: 0x1234             ## Optional
            Descriptors:
              - Operator: DW_OP_consts
                Values:   [ 0x1234 ]
```

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D84234
2020-08-03 23:20:15 +08:00
Shinji Okumura 1c2777f585 [NFC][APInt][DenseMapInfo] Move DenseMapAPIntKeyInfo into DenseMap.h as DenseMapInfo<APInt>
`DenseMapAPIntKeyInfo` was previously located in `lib/IR/LLVMContextImpl.h`.
Moved it into `include/ADT/DenseMapInfo.h` so that it can be reused.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D85131
2020-08-03 23:31:13 +09:00
Matt Arsenault 1782fbbc69 GlobalISel: Reimplement moreElementsVectorDst
Use pad with undef and unmerge with unused results. This is annoyingly
similar to several other places in LegalizerHelper, but they're all
slightly different.
2020-08-03 09:03:48 -04:00
Georgii Rymar d919ae9df8 [yaml2obj] - Add a support for "<none>" value for all optional fields.
It implements an approach suggested in the D84398 thread.

With it the following:

```
Sections:
  - Name:   .bar
    Type:   SHT_PROGBITS
    Offset: [[MACRO=<none>]]
```

works just as if the `Offset` key was not specified.
It is useful for tests that want to have a default value for a field and,
at the same time, a way to override it.

Differential revision: https://reviews.llvm.org/D84526
2020-08-03 12:27:39 +03:00
Djordje Todorovic 4fdc4d892b [NFC] [MIR] Document the reg state flags
This patch adds documentation for the RegState enumeration.

Differential Revision: https://reviews.llvm.org/D84634
2020-08-03 09:03:24 +02:00
Fangrui Song 40da58a04b [MC] Default MCAsmBackend::mayNeedRelaxation() to false 2020-08-02 22:13:59 -07:00
Florian Hahn 599955eb56 Recommit "[IPConstProp] Remove and move tests to SCCP."
This reverts commit 59d6e814ce.

The cause for the revert (3 clang tests running opt -ipconstprop) was
fixed by removing those lines.
2020-08-02 22:23:54 +01:00
Simon Pilgrim e202236721 [IR] Add IRBuilderBase::CreateVectorSplat(ElementCount EC) variant
As discussed on D81500, this adds a more general ElementCount variant of the build helper and converts the (non-scalable) unsigned NumElts variant to use it internally.
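
A hedged usage sketch (the wrapper function and its context are assumptions;
the overload itself is the one this change adds):

```
#include "llvm/IR/IRBuilder.h"

// Splat V across EC lanes; EC may describe a fixed or a scalable vector.
llvm::Value *splat(llvm::IRBuilderBase &B, llvm::ElementCount EC,
                   llvm::Value *V) {
  return B.CreateVectorSplat(EC, V);
}
```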
2020-08-02 16:55:38 +01:00
Matt Arsenault 212570abcf GlobalISel: Implement bitcast action for G_EXTRACT_VECTOR_ELEMENT
For AMDGPU, vectors with elements < 32 bits should be indexed in
32-bit elements and the desired bits extracted from there. For
elements > 64 bits, these should be reduced to 64/32-bit elements to enable
the normal dynamic indexing paths.

In the dynamic index cases, this produces shorter code most of the
time. This does immediately regress the constant index cases, but this
should be fixed once we have the most basic of shift combines.

The element size > 64 case is pretty much ported from the existing
DAG implementation for extract element promote. The increasing element
size case is new.
2020-08-02 10:42:07 -04:00
Simon Pilgrim b8ffbf0e02 [DAG] TargetLowering::expandMUL_LOHI - pass SDLoc as const&
Try to be more consistent with the SDLoc param in the TargetLowering methods.

This also exposes an issue where we were passing an SDNode as an SDLoc, relying on the implicit SDLoc(SDNode) constructor.
2020-08-02 15:31:36 +01:00