Commit Graph

42501 Commits

Author SHA1 Message Date
Eric Astor 0548d1ca24 [ms] [llvm-ml] Add support for "alias" directive
Support the "alias" directive.

Required support for emitWeakReference in MCWinCOFFStreamer.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D87403
2020-09-29 16:59:50 -04:00
Eric Astor fdd23a3542 [ms] [llvm-ml] Add REAL10 support (x87 extended precision)
Add MASM support for 80-bit reals in the x87 extended precision format.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D87402
2020-09-29 16:58:46 -04:00
Eric Astor c65e9e71eb [ms] [llvm-ml] Add MASM hex float support
Implement MASM's syntax for specifying floats in raw hexadecimal bytes.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D87401
2020-09-29 16:57:32 -04:00
Eric Astor 6b70a83d9c [ms] [llvm-ml] Add support for .radix directive, and accept all radix specifiers
Add support for .radix directive, and radix specifiers [yY] (binary), [oOqQ] (octal), and [tT] (decimal).

Also, when lexing MASM integers, require radix specifier; MASM requires that all literals without a radix specifier be treated as in the default radix. (e.g., 0100 = 100)

Relanding D87400, now with fewer ms-inline-asm tests broken!

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D88337
2020-09-29 16:55:51 -04:00
Zequan Wu 6c91e623e5 [CodeGen] emit CG profile for COFF object file
Differential Revision: https://reviews.llvm.org/D87811
2020-09-29 12:03:30 -07:00
Simon Pilgrim 14ff38e235 [InstCombine] visitTrunc - trunc (lshr (sext A), C) --> (ashr A, C) non-uniform support
This came from @lebedev.ri's suggestion to use m_SpecificInt_ICMP for D88429 - since I was going to change the m_APInt to m_Constant for that patch I thought I would do it for the only other user of the APInt first.

I've added a ConstantExpr::getUMin helper - its trivial to add UMAX/SMIN/SMAX but thought I'd wait until we have use cases.

Differential Revision: https://reviews.llvm.org/D88475
2020-09-29 15:01:16 +01:00
Jay Foad d6b04f3937 [SDag] Refactor and simplify divergence calculation and checking. NFC. 2020-09-29 14:05:07 +01:00
sstefan1 cb9cfa0d2f [OpenMPOpt][Fix] Only initialize ICV initial values once.
Reviewers: jdoerfert, ggeorgakoudis

Differential Revision: https://reviews.llvm.org/D88441
2020-09-29 12:22:58 +02:00
Max Kazantsev 9100bd772d [SCEV][NFC] Introduce isBasicBlockEntryGuardedByCond
Currently, we have `isLoopEntryGuardedByCond` method in SCEV, which
checks that some fact is true if we enter the loop. In fact, this is just a
particular case of more general concept `isBasicBlockEntryGuardedByCond`
applied to given loop's header. In fact, the logic if this code is largely
independent on the given loop and only cares code above it.

This patch makes this generalization. Now we can query it for any block,
and `isBasicBlockEntryGuardedByCond` is just a particular case.

Differential Revision: https://reviews.llvm.org/D87828
Reviewed By: fhahn
2020-09-29 15:53:45 +07:00
Tres Popp eb9f7c28e5 Revert "OpaquePtr: Add type to sret attribute"
This reverts commit 55c4ff91bd.

Issues were introduced as discussed in https://reviews.llvm.org/D88241
where this change made previous bugs in the linker and BitCodeWriter
visible.
2020-09-29 10:31:04 +02:00
Amara Emerson b9f2b3bc43 [AArch64][GlobalISel] Scalarize <2 x s64> G_MUL since we don't have native support for it.
Differential Revision: https://reviews.llvm.org/D88437
2020-09-28 19:29:45 -07:00
Yonghong Song 54d9f743c8 BPF: move AbstractMemberAccess and PreserveDIType passes to EP_EarlyAsPossible
Move abstractMemberAccess and PreserveDIType passes as early as
possible, right after clang code generation.

Currently, compiler may transform the above code
  p1 = llvm.bpf.builtin.preserve.struct.access(base, 0, 0);
  p2 = llvm.bpf.builtin.preserve.struct.access(p1, 1, 2);
  a = llvm.bpf.builtin.preserve_field_info(p2, EXIST);
  if (a) {
    p1 = llvm.bpf.builtin.preserve.struct.access(base, 0, 0);
    p2 = llvm.bpf.builtin.preserve.struct.access(p1, 1, 2);
    bpf_probe_read(buf, buf_size, p2);
  }
to
  p1 = llvm.bpf.builtin.preserve.struct.access(base, 0, 0);
  p2 = llvm.bpf.builtin.preserve.struct.access(p1, 1, 2);
  a = llvm.bpf.builtin.preserve_field_info(p2, EXIST);
  if (a) {
    bpf_probe_read(buf, buf_size, p2);
  }
and eventually assembly code looks like
  reloc_exist = 1;
  reloc_member_offset = 10; //calculate member offset from base
  p2 = base + reloc_member_offset;
  if (reloc_exist) {
    bpf_probe_read(bpf, buf_size, p2);
  }
if during libbpf relocation resolution, reloc_exist is actually
resolved to 0 (not exist), reloc_member_offset relocation cannot
be resolved and will be patched with illegal instruction.
This will cause verifier failure.

This patch attempts to address this issue by do chaining
analysis and replace chains with special globals right
after clang code gen. This will remove the cse possibility
described in the above. The IR typically looks like
  %6 = load @llvm.sk_buff:0:50$0:0:0:2:0
  %7 = bitcast %struct.sk_buff* %2 to i8*
  %8 = getelementptr i8, i8* %7, %6
for a particular address computation relocation.

But this transformation has another consequence, code sinking
may happen like below:
  PHI = <possibly different @preserve_*_access_globals>
  %7 = bitcast %struct.sk_buff* %2 to i8*
  %8 = getelementptr i8, i8* %7, %6

For such cases, we will not able to generate relocations since
multiple relocations are merged into one.

This patch introduced a passthrough builtin
to prevent such optimization. Looks like inline assembly has more
impact for optimizaiton, e.g., inlining. Using passthrough has
less impact on optimizations.

A new IR pass is introduced at the beginning of target-dependent
IR optimization, which does:
  - report fatal error if any reloc global in PHI nodes
  - remove all bpf passthrough builtin functions

Changes for existing CORE tests:
  - for clang tests, add "-Xclang -disable-llvm-passes" flags to
    avoid builtin->reloc_global transformation so the test is still
    able to check correctness for clang generated IR.
  - for llvm CodeGen/BPF tests, add "opt -O2 <ir_file> | llvm-dis" command
    before "llc" command since "opt" is needed to call newly-placed
    builtin->reloc_global transformation. Add target triple in the IR
    file since "opt" requires it.
  - Since target triple is added in IR file, if a test may produce
    different results for different endianness, two tests will be
    created, one for bpfeb and another for bpfel, e.g., some tests
    for relocation of lshift/rshift of bitfields.
  - field-reloc-bitfield-1.ll has different relocations compared to
    old codes. This is because for the structure in the test,
    new code returns struct layout alignment 4 while old code
    is 8. Align 8 is more precise and permits double load. With align 4,
    the new mechanism uses 4-byte load, so generating different
    relocations.
  - test intrinsic-transforms.ll is removed. This is used to test
    cse on intrinsics so we do not lose metadata. Now metadata is attached
    to global and not instruction, it won't get lost with cse.

Differential Revision: https://reviews.llvm.org/D87153
2020-09-28 16:56:22 -07:00
Amara Emerson 082321909e [GlobalISel] Add support for lowering of vector G_SELECT and use for AArch64.
The lowering is a port of the SDAG expansion.

Differential Revision: https://reviews.llvm.org/D88364
2020-09-28 14:00:46 -07:00
Benjamin Kramer b59dff4b16 [wasm] Move WasmTraits.h to BinaryFormat
There's no dependency on Object in there and this avoids a cyclic
dependency between libMC and libObject.
2020-09-28 22:07:28 +02:00
Sanjay Patel c37a8acef6 [CostModel] remove hack for intrinsic cost based on cost type
This hack seems to only have been necessary because of the
constructor bug noted in 33125cffd.

Once again, it's hard to prove NFC, but that's the hope...
2020-09-28 15:58:42 -04:00
Sanjay Patel 745abbbb85 [CostModel] move early exit for free intrinsics
This should be NFC unless some target was expecting that
some form of cttz/ctlz/memcpy is free in terms of size/latency
but not free in throughput cost.
2020-09-28 13:30:55 -04:00
Sanjay Patel 1121a583b8 [CostModel] split handling of intrinsics from other calls
This should be close to NFC (no-functional-change), but I
can't completely rule out that some call on some target
travels down a different path. There's an especially large
amount of code spaghetti in this part of the cost model.

The goal is to clean up the intrinsic cost handling so
we can canonicalize to the new min/max intrinsics without
causing regressions.
2020-09-28 13:30:55 -04:00
Jessica Paquette a52e78012a [GlobalISel] Combine (xor (and x, y), y) -> (and (not x), y)
When we see this:

```
%and = G_AND %x, %y
%xor = G_XOR %and, %y
```

Produce this:

```
%not = G_XOR %x, -1
%new_and = G_AND %not, %y
```

as long as we are guaranteed to eliminate the original G_AND.

Also matches all commuted forms. E.g.

```
%and = G_AND %y, %x
%xor = G_XOR %y, %and
```

will be matched as well.

Differential Revision: https://reviews.llvm.org/D88104
2020-09-28 10:08:14 -07:00
Paul C. Anagnostopoulos c372809f5a [TableGen] Improved messages in PseudoLoweringEmitter. 2020-09-28 10:18:22 -04:00
Georgii Rymar dab9917164 [yaml2obj][obj2yaml] - Add a support for SHT_ARM_EXIDX section.
This adds the support for SHT_ARM_EXIDX sections to obj2yaml/yaml2obj tools.

SHT_ARM_EXIDX is a ARM specific index table filled with entries.
Each entry consists of two 4-bytes values (words).
(https://developer.arm.com/documentation/ihi0038/c/?lang=en#index-table-entries)

Differential revision: https://reviews.llvm.org/D88228
2020-09-28 11:45:49 +03:00
Benjamin Kramer 7e5a356d2b [Coroutines] Remove unused includes. NFC. 2020-09-28 10:27:23 +02:00
Chuanqi Xu b3a722e66b [Coroutines] Reuse storage for local variables with non-overlapping lifetimes
bug 45566 shows the process of building coroutine frame won't consider
that the lifetimes of different local variables are not overlapped,
which means the compiler could generates smaller frame.

This patch calculate the lifetime range of each alloca by StackLifetime
class. Then the patch build non-overlapped sets for allocas whose
lifetime ranges are not overlapped. We use the largest type in a
non-overlapped set as the field type in the frame. In insertSpills
process, if we find the type of field is not the same with the alloca,
we cast the pointer to the field type to the pointer to the alloca type.
Since the lifetime range of alloca in one non-overlapped set is not
overlapped with each other, it should be ok to reuse the storage space
in the frame.

Test plan: check-llvm, check-clang, cppcoro, folly

Reviewers: junparser, lxfind, modocache

Differential Revision: https://reviews.llvm.org/D87596
2020-09-28 15:48:00 +08:00
David Sherwood bafdd11326 [SVE] Replace / operator in TypeSize/ElementCount with divideCoefficientBy
After some recent upstream discussion we decided that it was best
to avoid having the / operator for both ElementCount and TypeSize,
since this could give the impression that these classes can be used
in the same way as basic integer integer types. However, division
for scalable types is a bit odd because we are only dividing the
minimum quantity by a value, as opposed to something like:

  (MinSize * Vscale) / SomeValue

This is why when performing division it's important the caller
first establishes whether the operation makes sense, perhaps by
calling isKnownMultipleOf() prior to division. The caller must now
explictly call divideCoefficientBy() on the class to perform the
operation.

Differential Revision: https://reviews.llvm.org/D87700
2020-09-28 08:03:00 +01:00
Arthur Eubanks a2578e92e2 Revert "Reland [CodeGen] emit CG profile for COFF object file"
This reverts commit 506b6170cb.

This still causes link errors, see https://crbug.com/1130780.
2020-09-27 22:43:14 -07:00
Nikita Popov fe79061be2 [LVI][CVP] Use block value when simplifying icmps
Add a flag to getPredicateAt() that allows making use of the block
value. This allows us to take into account range information from
the current block, rather than only information that is threaded
over edges, making the icmp simplification in CVP a lot more
powerful.

I'm not changing getPredicateAt() to use the block value
unconditionally to avoid any impact on the JumpThreading pass,
which is somewhat picky about LVI query order.

Most test changes here are just icmps that now get dropped (while
previously only a result used in a return was replaced). The three
tests in icmp.ll show some representative improvements. Some of
the folds this enables have been covered by IPSCCP in the meantime,
but LVI can reason about some cases which are hard to support in
IPSCCP, such as in test_br_cmp_with_offset.

The compile-time time cost of doing this is fairly minimal, with
a ~0.05% CTMark regression for ReleaseThinLTO:
https://llvm-compile-time-tracker.com/compare.php?from=709d03f8af4da4204849a70f01798e7cebba2e32&to=6236fd503761f43c99f4537121e057a01056f185&stat=instructions

This is because the block values will typically already be queried
and cached by other CVP optimizations anyway.

Differential Revision: https://reviews.llvm.org/D69686
2020-09-27 20:25:16 +02:00
Fangrui Song 50bd71e1d7 [NewPM] Port ConstraintElimination to the new pass manager
If -enable-constraint-elimination is specified, add it to the -O2/-O3 pipeline.
(-O1 uses a separate function now.)

Reviewed By: fhahn, aeubanks

Differential Revision: https://reviews.llvm.org/D88365
2020-09-27 11:12:26 -07:00
Nikita Popov 9b959b59df [LVI] Require context instruction in external API (NFCI)
Require CxtI in getConstant() and getConstantRange() APIs.
Accordingly drop the BB parameter, as it is implied by
CxtI->getParent().

This makes sure we don't forget to pass the context instruction,
and makes the API contract clearer (also clean up the comments to
that effect -- the value holds at the context instruction, not
the end of the block).
2020-09-27 18:07:24 +02:00
Robert Widmann 55f727306e [LLVM-C] Turn a ShuffleVector Constant Into a Getter.
It is not a good idea to expose raw constants in the LLVM C API. Replace this with an explicit getter.

Differential Revision: https://reviews.llvm.org/D88367
2020-09-26 17:32:57 -06:00
Paul C. Anagnostopoulos 50a3df585d [TableGen] Add/edit Doxygen comments to match "TableGen Backend Developer's Guide." 2020-09-26 09:09:22 -04:00
Qiu Chaofan c0f8e4c06c [SelectionDAG] Add guard to automatically insert flags
This is like FastMathFlagGuard in IR. Since we use SDAG instance to get
values, it's with SelectionDAG. By creating a FlagInserter in current
scope, all values created by getNode will get the flags if no Flags
argument provided.

In this patch, I applied it to floating point operations folding part in
DAG combiner, and removed Flags passing to getNode to show its effect.
Other places in DAG combiner and other helper methods similar to getNode
also need this. They can be done in follow-up patches.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D87361
2020-09-26 13:57:52 +08:00
John Demme 76419525fb Common code preparation for tblgen-types patch
Cleanup and add methods which https://reviews.llvm.org/D86904 requires. Breaking up to lower review load.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D88267
2020-09-26 02:47:48 +00:00
Arthur Eubanks 83e3ea2cfc [LowerTypeTests][NewPM] Add constructor that uses command line flags
This matches the legacy PM pass by having one constructor use command
line flags, and the other use parameters to the pass.

This fixes all tests under Transforms/LowerTypeTests using NPM.

Reviewed By: ychen, pcc

Differential Revision: https://reviews.llvm.org/D87845
2020-09-25 17:39:59 -07:00
Michael Collison 764c1b7a4d [RISCV] Scheduler description for Bullet
Add the pipeline model for the RISC-V Bullet micro architecture.

Co-authored-by: Evandro Menezes <evandro.menezes@sifive.com>
2020-09-25 18:36:53 -05:00
Alexander Shaposhnikov 97702c3d92 [Object][MachO] Refine the interface of Slice
This patch performs a minor cleanup of the class Slice:
static methods and constructors which take a pointer but assume that
it's not null now take the argument by reference.
NFC.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D88320
2020-09-25 16:27:45 -07:00
Craig Topper b5f46534c4 [IR] Improve the description for Constant::isNormalFP to list all things that are not normal instead of just denormal. NFC 2020-09-25 16:26:46 -07:00
Craig Disselkoen 51cad041e0 C API: functions to get mask of a ShuffleVector
This commit fixes a regression (from LLVM 10 to LLVM 11 RC3) in the LLVM
C API.

Previously, commit 1ee6ec2bf removed the mask operand from the
ShuffleVector instruction, storing the mask data separately in the
instruction instead; this reduced the number of operands of
ShuffleVector from 3 to 2. AFAICT, this change unintentionally caused
a regression in the LLVM C API. Specifically, it is no longer possible
to get the mask of a ShuffleVector instruction through the C API. This
patch introduces new functions which together allow a C API user to get
the mask of a ShuffleVector instruction, restoring the functionality
which was previously available through LLVMGetOperand().

This patch also adds tests for this change to the llvm-c-test
executable, which involved adding support for InsertElement,
ExtractElement, and ShuffleVector itself (as well as constant vectors)
to echo.cpp. Previously, vector operations weren't tested at all in
echo.ll.

I also fixed some typos in comments and help-text nearby these changes,
which I happened to spot while developing this patch. Since the typo
fixes are technically unrelated other than being in the same files, I'm
happy to take them out if you'd rather they not be included in the patch.

Differential Revision: https://reviews.llvm.org/D88190
2020-09-25 16:01:05 -07:00
Eli Friedman 4600e21051 [AArch64][SVE] Drop "argmemonly" from gather/scatter with vector base.
The intrinsics don't have any pointer arguments, so "argmemonly" makes
optimizations think they don't write to memory at all.

Differential Revision: https://reviews.llvm.org/D88186
2020-09-25 16:01:05 -07:00
Simon Pilgrim 7fa464f33d Fix copy+paste typo in doxygen parameter name to fix Wdocumentation. NFCI. 2020-09-25 22:09:51 +01:00
Arthur Eubanks d3f6972abb [LoopReroll][NewPM] Port -loop-reroll to NPM
Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D87957
2020-09-25 12:09:06 -07:00
Thomas Lively 89fe083c19 [WebAssembly] Check features before making SjLj vars thread-local
1c5a3c4d38 updated the variables inserted by Emscripten SjLj lowering to be
thread-local, depending on the CoalesceFeaturesAndStripAtomics pass to downgrade
them to normal globals if the target features did not support TLS. However, this
had the unintended side effect of preventing all non-TLS-supporting objects from
being linked into modules with shared memory, because stripping TLS marks an
object as thread-unsafe. This patch fixes the problem by only making the SjLj
lowering variables thread-local if the target machine supports TLS so that it
never introduces new usage of TLS that will be stripped. Since SjLj lowering
works on Modules instead of Functions, this required that the
WebAssemblyTargetMachine have its feature string updated to reflect the
coalesced features collected from all the functions so that a
WebAssemblySubtarget can be created without using any particular function.

Differential Revision: https://reviews.llvm.org/D88323
2020-09-25 11:45:16 -07:00
Matt Arsenault 55c4ff91bd OpaquePtr: Add type to sret attribute
Make the corresponding change that was made for byval in
b7141207a4. Like byval, this requires a
bulk update of the test IR tests to include the type before this can
be mandatory.
2020-09-25 14:07:30 -04:00
Hans Wennborg 4f1897c6f0 Move PassBuilder::registerParseTopLevelPipelineCallback out-of-line
For some mysterious reason it doesn't build with clang-cl when compiled
as part of the includes in clang's CodeGenAction.cpp
(crbug.com/1132292).
2020-09-25 19:55:40 +02:00
Dávid Bolvanský 179e15d53a [SystemZ] Optimize bcmp calls (PR47420)
Solves https://bugs.llvm.org/show_bug.cgi?id=47420

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D87988
2020-09-25 17:55:39 +02:00
Snehasish Kumar d2696dec45 [llvm] Add -bbsections-cold-text-prefix to emit cold clusters to a different section.
This change adds an option to basic block sections to allow cold
clusters to be assigned a custom text prefix. With a custom prefix such
as ".text.split." (D87840), lld can place them in a separate output section.
The benefits are -

* Empirically shown to improve icache and itlb metrics by 3-5%
(absolute) compared to placing split parts in .text.unlikely.
* Mitigates against poor profiles, eg samplePGO profiles used with the
machine function splitter. Optimizations such as hugepage remapping can
make different decisions at the section granularity.
* Enables section granularity hotness monitoring (checking on the
decisions made during compilation vs sample data from production).

Differential Revision: https://reviews.llvm.org/D87813
2020-09-24 15:26:15 -07:00
Joseph Huber a22814194e [OpenMP] OpenMPOpt Support for Globalization Remarks
Summary:
This patch add support for printing analysis messages relating to data
globalization on the GPU. This occurs when data is shared between the
threads in a GPU context and must be pushed to global or shared memory.

Reviewers: jdoerfert

Subscribers: guansong hiraditya llvm-commits ormris sstefan1 yaxunl

Tags: #OpenMP #LLVM

Differential Revision: https://reviews.llvm.org/D88243
2020-09-24 18:23:12 -04:00
Vedant Kumar dfc5a9eb57 [Instruction] Add dropLocation and updateLocationAfterHoist helpers
Introduce a helper which can be used to update the debug location of an
Instruction after the instruction is hoisted. This can be used to safely
drop a source location as recommended by the docs.

For more context, see the discussion in https://reviews.llvm.org/D60913.

Differential Revision: https://reviews.llvm.org/D85670
2020-09-24 15:00:04 -07:00
Zequan Wu 506b6170cb Reland [CodeGen] emit CG profile for COFF object file
This reverts commit 90242caca2.

Error fixed at f5435399e8

Differential Revision: https://reviews.llvm.org/D87811
2020-09-24 14:38:53 -07:00
Daniel Kiss 2a96f47c5f [AArch64] __builtin_return_address for PAuth.
This change adds the support for __builtin_return_address
for ARMv8.3A Pointer Authentication.
Location of the authentication code in the pointer depends on
the system configuration, therefore a dedicated instruction is used for
effectively removing the authentication code without
authenticating the pointer.

Reviewed By: chill

Differential Revision: https://reviews.llvm.org/D75044
2020-09-24 23:23:49 +02:00
Andrew Litteken f02c4c87b4 [IRSim] Adding wrapper pass for IRSimilarityIdentfier
This introduces an analysis pass that wraps IRSimilarityIdentifier,
and adds a printer pass to examine in what function similarities are
being found.

Test for what the printer pass can find are in
test/Analysis/IRSimilarityIdentifier.

Reviewed by: paquette, jroelofs

Differential Revision: https://reviews.llvm.org/D86973
2020-09-24 14:59:41 -05:00
Matt Arsenault d65a7003c4 OpaquePtr: Add helpers for sret to mirror byval
Sret should really have a type parameter like byval does.
2020-09-24 09:57:28 -04:00