Commit Graph

356544 Commits

Author SHA1 Message Date
Guillaume Chatelet be4f5061ea [Alignment][NFC] Migrate part of Arm/AArch64 backend
Summary: Follow up on D81196

Reviewers: courbet

Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81274
2020-06-08 07:15:37 +00:00
Simon Wallis 7432fb2c78 [ARM][XO] Execute-only miscompiles double literals for big-endian
Summary:
With -mbig-endian -mexecute-only and targeting an fpu,
an incorrect sequence of movw/movt was generated to construct a double literal.
The test suite was hardwired to check these wrong values.

The fault was caused by the explicit word swap in LowerConstantFP().

With -mbig-endian -mexecute-only -mfpu=none, a correct sequence of
movw/movt is generated to construct a double literal.
The test suite did not test this no fpu case.

The test suite expected values have been corrected.
The test file is updated to add testing of fpu=none case

Reviewers: christof, llvm-commits, dmgreen

Reviewed By: dmgreen

Subscribers: dmgreen, kristof.beyls, hiraditya, danielkiss

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81259

Change-Id: Ia3737df243218c89c82f02b7f9f4032ecd5a3917
2020-06-08 08:13:08 +01:00
Guillaume Chatelet 2aa483016d [Alignment][NFC] Migrate CallingConv tablegen code
Summary: This is a follow up on D81196, fixing one more call site.

Reviewers: courbet

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81268
2020-06-08 06:58:02 +00:00
Max Kazantsev 005db9c361 [Test] Add test showing InstCombine missing simplification opportunity 2020-06-08 13:19:09 +07:00
Craig Topper b0eea7213b [X86] Support load shrinking for strict fp nodes in combineCVTPH2PS 2020-06-07 21:09:55 -07:00
QingShan Zhang 3f0cc7ac5e [NFC] Remove the extra ; to avoid the warning of build compiler 2020-06-08 03:51:05 +00:00
Nemanja Ivanovic a56d057dfe [PowerPC] Do not assume operand of ADDI is an immediate
After pseudo-expansion, we may end up with ADDI (add immediate)
instructions where the operand is not an immediate but a relocation.
For such instructions, attempts to get the immediate result in
assertion failures for obvious reasons.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=45432
2020-06-07 22:18:31 -05:00
Craig Topper e3aece06cf [X86] Improve (vzmovl (insert_subvector)) combine to handle a bitcast between the vzmovl and insert
This combine tries shrink a vzmovl if its input is an
insert_subvector. This patch improves it to turn
(vzmovl (bitcast (insert_subvector))) into
(insert_subvector (vzmovl (bitcast))) potentially allowing the
bitcast to be folded with a load.
2020-06-07 19:31:06 -07:00
Craig Topper 22987babd5 [X86] Teach combineCVTP2I_CVTTP2I to handle STRICT_CVTTP2SI/STRICT_CVTTP2UI
Allows us to shrink 128-bit simple load to enable folding for
v2f32->v2i64 vcvttps2qq/vcvttps2uqq.
2020-06-07 19:31:06 -07:00
QingShan Zhang f8eabd6d01 [Power9] Add addi post-ra scheduling heuristic
The instruction addi is usually used to post increase the loop indvar, which looks like this:

label_X:
 load x, base(i)
 ...
 y = op x
 ...
 i = addi i, 1
 goto label_X

However, for PowerPC, if there are too many vsx instructions that between y = op x and  i = addi i, 1,
it will use all the hw resource that block the execution of i = addi, i, 1, which result in the stall
of the load instruction in next iteration. So, a heuristic is added to move the addi as early as possible
to have the load hide the latency of vsx instructions, if other heuristic didn't apply to avoid the starve.

Reviewed By: jji

Differential Revision: https://reviews.llvm.org/D80269
2020-06-08 01:31:07 +00:00
Jun Ma a0de3335ed [clang] Implement VectorType logic not operator.
Differential Revision: https://reviews.llvm.org/D80979
2020-06-08 08:41:01 +08:00
Craig Topper a135c4a2cf [X86] Don't scalarize v2f32->v2i64 strict_fp_to_sint/uint with avx512dq and not avx512vl.
We can pad the v2f32 with 0s up to v8f32 and use a v8f32->v8i64
operation. This is what we end up with on non-strict nodes except
we don't pad with 0s since we don't care about exceptions.
2020-06-07 14:45:26 -07:00
Benjamin Kramer 3badd17b69 SmallPtrSet::find -> SmallPtrSet::count
The latter is more readable and more efficient. While there clean up
some double lookups. NFCI.
2020-06-07 22:38:08 +02:00
Fangrui Song b6e143aa54 Reland D80966 [codeview] Put !heapallocsite on calls to operator new
With a change to use `CGM.getCodeGenOpts().getDebugInfo() != codegenoptions::NoDebugInfo`
instead of `getDebugInfo()`,
to fix `Profile-<arch> :: instrprof-gcov-multithread_fork.test`

See CodeGenModule::CodeGenModule, `EmitGcovArcs || EmitGcovNotes` can
set `clang::CodeGen::CodeGenModule::DebugInfo`.

---

Clang marks calls to operator new as heap allocation sites, but the
operator declared at global scope returns a void pointer. There is no
explicit cast in the code, so the compiler has to write down the
allocated type itself.

Also generalize a cast to use CallBase, so that we mark heap alloc sites
when exceptions are enabled.

Differential Revision: https://reviews.llvm.org/D80966
2020-06-07 13:35:20 -07:00
Simon Pilgrim ce677ef532 [X86][AVX2] combineSetCCMOVMSK - handle all_of patterns for PMOVMSKB(PACKSSBW(LO(X), HI(X)))
In the sign splat case, we can fold PMOVMSKB(PACKSSBW(LO(X), HI(X))) -> PMOVMSKB(BITCAST_v32i8(X)) without introducing a signmask + comparison (which unlike for any_of won't fold into a single TEST).
2020-06-07 21:08:53 +01:00
Mehdi Amini a25f5cd70c Revert "[MLIR] Lower shape.num_elements -> shape.reduce."
This reverts commit e80617df89.

This broke the build with `-DBUILD_SHARED_LIBS=ON`
2020-06-07 19:32:36 +00:00
Fangrui Song 336e1f03d1 [Driver] Omit -mthread-model posix which is the CC1 default 2020-06-07 12:27:11 -07:00
Fangrui Song e3200dab60 [gcov] Support .gcno/.gcda in gcov 8, 9 or 10 compatible formats 2020-06-07 11:27:49 -07:00
Benjamin Kramer 02e35832c3 [Driver] Simplify code. NFCI. 2020-06-07 20:18:14 +02:00
AK 96458fc510 Add cl::ZeroOrMore to get around build system issues
It is quite common to get multiple instances of optimization flags while building.
The following optimizations does not have cl::ZeroOrMore which causes errors during the build.

Reviewers: alexbdv,spop

Differential Revision: https://reviews.llvm.org/D81187
2020-06-07 10:15:18 -07:00
Kang Zhang c3f5ceefb8 [NFC][PowerPC] Add a new case to test ctrloop for fp128 2020-06-07 16:35:32 +00:00
Fangrui Song b2ffe940b0 [gcov] Fix instrprof-gcov-__gcov_flush-terminate.test 2020-06-07 09:31:52 -07:00
Simon Pilgrim 175fc4023a CFG.h - reduce includes to forward declarations. NFC. 2020-06-07 17:25:35 +01:00
Benjamin Kramer 98626f78ae Unbreak the build 2020-06-07 18:17:21 +02:00
Benjamin Kramer 27e0077dcf Try to make msvc crash less
llvm-project\clang\lib\Driver\Types.cpp(44): fatal error C1001: An internal error has occurred in the compiler.
(compiler file 'msc1.cpp', line 1518)
2020-06-07 18:07:07 +02:00
Simon Pilgrim f6cb987d50 DomTreeUpdater.h - refine includes. NFC.
We don't need any of its defs or many of its includes inside PostDominators.h - so split it and reduce the frontend load.
2020-06-07 16:57:48 +01:00
Fangrui Song bfce849d83 [gcov][test] Delete UNSUPPORTED: host-byteorder-big-endian from test/profile tests
It seems that after dc52ce424b, all big-endian problems have been fixed.

01899bb4e4 seems to have fixed XFAIL: * of
profile/instrprof-gcov-__gcov_flush-terminate.test

This essentially reverts commit 5a9b792d72 and
93d5ae3af1.
2020-06-07 08:42:57 -07:00
Benjamin Kramer c0c6a12775 Put back definitions. We're still not C++17 :/ 2020-06-07 17:41:02 +02:00
Benjamin Kramer 5a098086f9 Put compilation phases from Types.def into a bit set
This avoids a global constructor and is a bit more efficient for
"contained" queries. No functionality change intended.
2020-06-07 17:22:44 +02:00
Benjamin Kramer 0c3df70fad Remove global std::string. StringRef is sufficient. NFC. 2020-06-07 17:22:44 +02:00
Shawn Landden 53a4bfa803 [AArch64] add test for large popcount; NFC 2020-06-07 19:20:33 +04:00
Sanjay Patel ad19b9cead [Docs] fix typos for llvm-mca; NFC 2020-06-07 11:14:24 -04:00
Simon Pilgrim 3a28ae091b [X86][SSE] combineSetCCMOVMSK - add initial support for allof patterns.
Handle MOVMSK 'allof' comparisons (X86ISD::SUB X, AllBitsMask) as well as 'anyof' patterns.

This allows us to handle these patterns in the MOVMSK(BITCAST(X)) pattern to fix PR37087.
2020-06-07 16:10:13 +01:00
Fangrui Song dc52ce424b [llvm-cov] Fix gcov version detection on big-endian 2020-06-07 08:07:32 -07:00
Xing GUO a68601b3fa [ObjectYAML][test] Address comments in D80203
This patch addresses comments in [D80203](https://reviews.llvm.org/D80203?vs=on&id=266415&whitespace=ignore-most#2062287)

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D80862
2020-06-07 22:55:33 +08:00
Xing GUO d527690103 [DWARFYAML][debug_ranges] Fix inappropriate assertion. NFC. 2020-06-07 22:45:52 +08:00
Alexander Belyaev e80617df89 [MLIR] Lower shape.num_elements -> shape.reduce.
Differential Revision: https://reviews.llvm.org/D81279
2020-06-07 16:39:21 +02:00
Alexander Belyaev 50f68c1e33 [mlir] Add verifier for `shape.yield`.
Differential Revision: https://reviews.llvm.org/D81262
2020-06-07 15:40:11 +02:00
Sanjay Patel 2552f65183 [InstCombine] fold mask op into casted shift (PR46013)
https://rise4fun.com/Alive/Qply8

  Pre: C2 == (-1 u>> zext(C1))
  %a = ashr %x, C1
  %s = sext %a to i16
  %r = and i16 %s, C2
    =>
  %s2 = sext %x to i16
  %r = lshr i16 %s2, zext(C1)

https://bugs.llvm.org/show_bug.cgi?id=46013
2020-06-07 09:33:18 -04:00
Sanjay Patel c6719d0b47 [InstCombine] add tests for bitmask of casted shift; NFC (PR46013) 2020-06-07 09:33:18 -04:00
Ties Stuij 5945e9799e [clang][BFloat] Add reinterpret cast intrinsics
Summary:
This patch is part of a series implementing the Bfloat16 extension of the
Armv8.6-a architecture, as detailed here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

The bfloat type, and its properties is specified in the Arm C language
extension specification:

https://developer.arm.com/docs/ihi0055/d/procedure-call-standard-for-the-arm-64-bit-architecture

Subscribers: kristof.beyls, ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79869

The following people contributed to this patch:

- Luke Cheeseman
- Alexandros Lamprineas
- Luke Geeson
- Ties Stuij
2020-06-07 14:32:37 +01:00
Simon Pilgrim 1c2d2c88b4 AlignmentFromAssumptions.h - reduce includes to forward declarations. NFC. 2020-06-07 13:51:48 +01:00
Simon Pilgrim 6602e4ca4b MemorySSAUpdater.h - reduce includes to forward declarations. NFC. 2020-06-07 13:16:31 +01:00
Simon Pilgrim 3642d38823 DependenceAnalysis.h - reduce AliasAnalysis.h include to forward declaration. NFC.
This requires the replacement of legacy class AliasAnalysis usages with AAResults (which it typedefs to anyhow)
2020-06-07 12:47:37 +01:00
Simon Pilgrim b296fd2024 MustExecute.h - remove unnecessary Instruction.h include. NFC.
We already have the Instruction forward declaration.
2020-06-07 12:10:50 +01:00
Simon Pilgrim 91591ec424 ObjCARCAnalysisUtils.h - remove unused LLVMContext.h include. NFC. 2020-06-07 11:48:46 +01:00
Simon Pilgrim 1e9d2f908e OrderedInstructions.h - reduce includes to forward declarations. NFC. 2020-06-07 11:44:43 +01:00
Simon Pilgrim 52d6950c47 [X86][SSE] Extend ICMP(MOVMSK(BITCAST(X))) tests to allof patterns as well as the existing noneof/anyof patterns. 2020-06-07 11:44:43 +01:00
Simon Pilgrim 0741b75ad5 [X86][SSE] Attempt to widen MOVMSK vector input if the signbits are splatted.
As shown on PR37087, if we have a MOVMSK(BICAST(X)) from a wider vector, then by using MOVMSK from the wider type (32/64-bit elements) we can improve the chances of further combines with SimplifyDemandedBits/Elts and on some targets (skylake) can be more efficient.
2020-06-07 11:44:43 +01:00
Florian Hahn 4affc444b4 [Matrix] Implement * binary operator for MatrixType.
This patch implements the * binary operator for values of
MatrixType. It adds support for matrix * matrix, scalar * matrix and
matrix * scalar.

For the matrix, matrix case, the number of columns of the first operand
must match the number of rows of the second. For the scalar,matrix variants,
the element type of the matrix must match the scalar type.

Reviewers: rjmccall, anemet, Bigcheese, rsmith, martong

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D76794
2020-06-07 11:11:27 +01:00