383304 Commits

Author SHA1 Message Date
Nikita Popov d11d5d1c5f [ValueTracking] Improve mul handling in isKnownNonEqual()
X != X * C is true if:
 * C is not 0 or 1
 * X is not 0
 * mul is nsw or nuw

Proof: https://alive2.llvm.org/ce/z/uwF29z

This is motivated by one of the cases in D98422.
2021-03-21 18:41:35 +01:00
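The rule stated in d11d5d1c5f can be sanity-checked by brute force at a small bit width. This is a minimal sketch, not the ValueTracking implementation: it assumes 8-bit integers, and the helper names (`to_signed`, `mul_flags`, `counterexamples`) are hypothetical. It enumerates every non-zero X and every C other than 0 and 1, and looks for cases where X == X * C.

```python
BITS = 8
MASK = (1 << BITS) - 1

def to_signed(v):
    """Interpret a BITS-wide bit pattern as a two's-complement integer."""
    return v - (1 << BITS) if v & (1 << (BITS - 1)) else v

def mul_flags(x, c):
    """Return (wrapped product, nuw holds, nsw holds) for bit patterns x and c."""
    wrapped = (x * c) & MASK
    nuw = x * c <= MASK  # no unsigned wrap
    signed_product = to_signed(x) * to_signed(c)
    nsw = -(1 << (BITS - 1)) <= signed_product <= (1 << (BITS - 1)) - 1  # no signed wrap
    return wrapped, nuw, nsw

def counterexamples(require_flag=True):
    """Collect (x, c) pairs with x != 0, c not in {0, 1}, and x == x * c."""
    found = []
    for x in range(1, 1 << BITS):
        for c in range(2, 1 << BITS):
            wrapped, nuw, nsw = mul_flags(x, c)
            if require_flag and not (nuw or nsw):
                continue
            if wrapped == x:
                found.append((x, c))
    return found

if __name__ == "__main__":
    print(len(counterexamples()))             # with nuw or nsw required
    print((128, 3) in counterexamples(False))  # wrapping mul: 128 * 3 wraps back to 128
```

With the nuw/nsw requirement the search finds nothing, while dropping it admits wrapping cases such as X = 128, C = 3, which is why the flag condition is needed.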
Nikita Popov f5bbdf2a67 [ValueTracking] Add more tests for isKnownNonEqual() of mul (NFC)
This is for the case of (x * C) == x, rather than the
(x * C1) == (x * C2) variant that we already cover.
2021-03-21 18:41:35 +01:00
Chris Lattner 1d909c9a35 Remove the extraneous MLIRContext argument from populateWithGenerated. NFC. 2021-03-21 10:38:35 -07:00
Matt Arsenault 20a24af01d MIR: Fix missing serialization for HasTailCall 2021-03-21 13:14:04 -04:00
Matt Arsenault a0f5aad6d7 AMDGPU: Fix allowing immediates for tail call pseudo.
The pseudo was using SSrc_b64, so it allowed folding immediates into
the destination operand for a tail call to null. However, this is not
a valid operand for the s_setpc_b64 this will be lowered to. Avoids
printing the operand as an invalid immediate.

Avoids a regression when tail calls are enabled in GlobalISel (somehow
tail calls to null get deleted in the DAG).
2021-03-21 13:14:04 -04:00
Chris Lattner ffde3acb1b [ShapeDialect] Silence a build warning, NFC
mlir/lib/Dialect/Shape/IR/Shape.cpp:573:26: warning: loop variable 'shape' is always a copy because the range of type '::mlir::Operation::operand_range' (aka 'mlir::OperandRange') does not return a reference [-Wrange-loop-analysis]
        for (const auto &shape : shapes()) {
                         ^
2021-03-21 10:10:38 -07:00
Chris Lattner 3a506b31a3 Change OwningRewritePatternList to carry an MLIRContext with it.
This updates the codebase to pass the context when creating an instance of
OwningRewritePatternList, and starts removing extraneous MLIRContext
parameters.  There are many many more to be removed.

Differential Revision: https://reviews.llvm.org/D99028
2021-03-21 10:06:31 -07:00
Nikita Popov 9f864d2025 Reapply [ConstantFold] Handle vectors in ConstantFoldLoadThroughBitcast()
There seems to be an impedance mismatch between what the type
system considers an aggregate (structs and arrays) and what
constants consider an aggregate (structs, arrays and vectors).

Adjust the type check to consider vectors as well. The previous
version of the patch dropped the type check entirely, but it
turns out that getAggregateElement() does require the constant
to be an aggregate in some edge cases: For Poison/Undef the
getNumElements() API is called, without checking in advance that
we're dealing with an aggregate. Possibly the implementation should
avoid doing that, but for now I'm adding an assert so the next
person doesn't fall into this trap.
2021-03-21 17:48:21 +01:00
Nikita Popov 59dbf4d516 [InstSimplify] Add load of undef aggregate test (NFC)
To make sure this doesn't crash the following commit.
2021-03-21 17:42:26 +01:00
Nikita Popov b32f5d5045 [InstSimplify] Regenerate test checks (NFC) 2021-03-21 17:41:21 +01:00
Nikita Popov ece1403aca [InstSimplify] Add additional select operand replacement tests (NFC)
This tests for binops with identity elements.
2021-03-21 15:30:30 +01:00
Nikita Popov daae927f9c [InstSimplify] Clean up SimplifyReplacedWithOp implementation (NFCI)
Replace Op with RepOp up-front, and then always work with the new
operands, rather than checking for replacement in various places.
2021-03-21 15:30:30 +01:00
Matt Arsenault 1098acd46d GlobalISel: Avoid unnecessary truncation to i64
We can just directly pass through the APInt to create a new constant.
2021-03-21 10:07:41 -04:00
Matt Arsenault 6314a72730 AMDGPU/GlobalISel: Enable CSE in pre-legalizer combiner 2021-03-21 10:07:37 -04:00
Simon Pilgrim 64c2641c89 [DAG] Limit (sext_in_reg (zero_extend_vector_inreg x)) to exact sign extension
As commented by @craig.topper on rG1ba5c550d418, we can't guarantee that we'll be extending zero bits, just the sign bit. So, revert to the old code for zero_extend_vector_inreg cases.
2021-03-21 14:01:37 +00:00
Jez Ng 8757616de3 [lld-macho][nfc] Format Options.td
Summary: A good chunk of it was mis-indented. Fixed by using the
formatting settings from llvm/utils/vim.
2021-03-21 09:33:04 -04:00
Simon Pilgrim 3179588947 [X86][AVX] ComputeNumSignBitsForTargetNode - add X86ISD::VBROADCAST handling for scalar sources
The target shuffle code handles vector sources, but X86ISD::VBROADCAST can also accept a scalar source for splatting.

Added as an extension to PR49658
2021-03-21 12:22:51 +00:00
Simon Pilgrim dc51cc3293 [X86] Add 'mulhs' variant of PR49658 test case 2021-03-21 12:09:05 +00:00
David Green 6d9d2049c8 [ARM] VINS f16 pattern
This adds an extra pattern for inserting an f16 into an odd vector lane
via a VINS. If the dual-insert-lane pattern does not happen to apply,
this can help with some simple cases.

Differential Revision: https://reviews.llvm.org/D95471
2021-03-21 12:00:06 +00:00
luxufan 02ffbac844 [RISCV] remove redundant instruction when eliminate frame index
The mv a0, a0 instruction is generated when the stack object offset is larger than int<12>. To handle this situation, the eliminateFrameIndex function
creates a virtual register, which the register scavenger must then scavenge. If the machine instruction that contains the stack object has the opcode ADDI (the addi
was generated by frameindexNode), and its destination register is the same as the register chosen by the register scavenger, then
mv a0, a0 is generated. To eliminate this instruction, eliminateFrameIndex no longer creates the virtual register when the instruction opcode is ADDI.

Differential Revision: https://reviews.llvm.org/D92479
2021-03-21 18:54:00 +08:00
Simon Pilgrim 297b9bc3fa [X86][AVX] computeKnownBitsForTargetNode - add X86ISD::VBROADCAST handling for scalar sources
The target shuffle code handles vector sources, but X86ISD::VBROADCAST can also accept a scalar source for splatting.

Suggested by @craig.topper on PR49658
2021-03-21 10:40:57 +00:00
Simon Pilgrim 613157dd67 [X86] Add PR49658 test case 2021-03-21 10:16:55 +00:00
Simon Pilgrim 54a05f2ec8 [X86] computeKnownBitsForTargetNode - add X86ISD::PMULUDQ handling
Reuse the existing KnownBits multiplication code to handle what is effectively an ISD::UMUL_LOHI variant
2021-03-21 09:57:20 +00:00
Fangrui Song 2288a75d9e [Driver] Linux.cpp: add -internal-isystem lib/../$triple/include
With this change, for `#include <ar.h>`, `clang --target=aarch64-linux-gnu`
will read `/usr/lib/gcc/aarch64-linux-gnu/10/../../../../aarch64-linux-gnu/include/ar.h`
(on Debian gcc->gcc-cross)
instead of `/usr/include/ar.h`. Some glibc headers (e.g. gnu/stubs.h) are different across architectures.
2021-03-21 00:56:03 -07:00
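The header path quoted in 2288a75d9e contains several `..` components. As a small illustration, the sketch below collapses them lexically with Python's `posixpath.normpath`. It assumes no symlinks are involved (a real filesystem lookup could resolve differently), and the variable names are hypothetical.

```python
import posixpath

# Path as quoted in the commit message (Debian gcc-cross layout).
gcc_relative = (
    "/usr/lib/gcc/aarch64-linux-gnu/10/../../../../"
    "aarch64-linux-gnu/include/ar.h"
)

# Purely lexical normalization: each ".." cancels the preceding component.
resolved = posixpath.normpath(gcc_relative)

if __name__ == "__main__":
    print(resolved)  # /usr/aarch64-linux-gnu/include/ar.h
```

The four `..` components cancel `10`, `aarch64-linux-gnu`, `gcc`, and `lib`, so the lookup lands in the triple-specific include directory under `/usr` rather than in `/usr/include`.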
Fangrui Song c2f9086b61 [Driver] Gnu.cpp: drop an unneeded special rule related to sysroot 2021-03-20 21:37:49 -07:00
Fangrui Song 56700e9379 [Driver] Gnu.cpp: drop an unneeded special rule related to sysroot
It seems unnecessary to diverge from GCC here.
Besides, lib/../$OSLibDir can be considered closer to the GCC
installation than the system root. The comment should not apply.
2021-03-20 21:32:55 -07:00
Fangrui Song 0ad0c476ef [Driver] Gnu.cpp: remove unneeded -L detection hack for -mx32
Removing the hack actually improves our compatibility with gcc -mx32.
2021-03-20 20:12:45 -07:00
Fangrui Song 775a294820 [Driver] Gnu.cpp: remove unneeded -L detection for libc++
If clang is installed in the system, the other -L suffice;
otherwise $ccc_install_dir/../lib below suffices.
2021-03-20 18:56:40 -07:00
Fangrui Song 06d6b1471e [Driver] Gnu.cpp: remove unneeded -L lib/gcc/$triple/$version/../../../$triple
After path resolution, it duplicates a subsequent -L entry. The entry below
(lib/gcc/$triple/$version/../../../../$OSLibDir) usually does not exist (e.g.
Arch Linux; Debian cross gcc). When it exists, it typically just has ld.so (e.g.
Debian native gcc) which cannot cause collision. Removing the -L (similar to
reordering it) is therefore justified.
2021-03-20 18:50:14 -07:00
Craig Topper 27bc30c39d [RISCV] Add test case to show a case where (mul (and X, 0xffffffff), (and Y, 0xffffffff)) optimization does not improve code.
If the mul had two users, one of which was a sext.w, the mul
would also be selected to a MULW before our pattern runs. This
causes the ANDs to now be used by the already selected MULW and
the mul we still need to select. They are unneeded on the MULW
since MULW only reads the lower bits. So they get selected to
SLLI+SRLI for the MULW use. The use for the
(mul (and X, 0xffffffff), (and Y, 0xffffffff)) manages to reuse
the SLLI.

The end result is increased register pressure and no improvement
to how soon we can start the MULW.
2021-03-20 17:54:28 -07:00
Chris Lattner 361b7d125b [Canonicalizer] Process regions top-down instead of bottom up & reuse existing constants.
This reapplies b5d9a3c / https://reviews.llvm.org/D98609 with a one line fix in
processExistingConstants to skip() when erasing a constant we've already seen.

Original commit message:

 1) Change the canonicalizer to walk the function in top-down order instead of
    bottom-up order.  This composes well with the "top down" nature of constant
    folding and simplification, reducing iterations and re-evaluation of ops in
    simple cases.
 2) Explicitly enter existing constants into the OperationFolder table before
    canonicalizing.  Previously we would "constant fold" them and rematerialize
    them, wastefully recreating a bunch of constants, which led to pointless
    memory traffic.

Both changes together provide a 33% speedup for canonicalize on some mid-size
CIRCT examples.

One artifact of this change is that the constants generated in normal pattern
application get inserted at the top of the function as the patterns are applied.
Because of this, we get "inverted" constants more often, which is an aesthetic
change to the IR but does permute some testcases.

Differential Revision: https://reviews.llvm.org/D99006
2021-03-20 16:30:15 -07:00
Andrew Litteken 0776eca7a4 Revert "[IRSim] Adding basic implementation of llvm-sim."
Causing build errors on the Windows Buildbots.

This reverts commit 5155dff278.
2021-03-20 18:03:09 -05:00
Jessica Clarke b2bb003774 [RISCV] Update comment in RISCVInstrInfoM.td
Missed in 07ed62b7d5.
2021-03-20 22:35:40 +00:00
Craig Topper 07ed62b7d5 [RISCV] Disable (mul (and X, 0xffffffff), (and Y, 0xffffffff)) optimization when Zba is enabled.
This optimization is trying to save SRLI instructions needed to
implement the ANDs. If we have zext.w we won't save anything.
Because we don't check that the multiply is the only user of the
AND we might even increase instruction count.
2021-03-20 15:31:45 -07:00
Craig Topper 0874281d60 [RISCV] Add Zba command lines to xaluo.ll. NFC
Some of the patterns end up with 32 to 64 bit zero extends on RV64
which can be handled by zext.w.
2021-03-20 15:31:45 -07:00
Fangrui Song 1fe1e996e9 [test] Delete "-internal-isystem" "/usr/local/include" 2021-03-20 15:24:02 -07:00
Craig Topper b0d8823a8a [RISCV] Add isel pattern to optimize (mul (and X, 0xffffffff), (and Y, 0xffffffff)) on RV64
This pattern computes the full 64 bit product of a 32x32 unsigned
multiply. This requires two pairs of SLLI+SRLI to zero the
upper 32 bits of the inputs.

We can do better than this by using two SLLI to move the lower
bits to the upper bits then use MULHU to compute the product. This
is the high half of a full 64x64 product. Since we put 32 0s in the lower
bits of the inputs we know the 128-bit product will have zeros in the
lower 64 bits. So the upper 64 bits, which MULHU computes, will contain
the original 64 bit product we were after.

The same trick would work for (mul (sext_inreg X, i32), (sext_inreg Y, i32))
using MULHS, but sext_inreg is sext.w which is already one instruction so we
wouldn't save anything.

Differential Revision: https://reviews.llvm.org/D99026
2021-03-20 14:55:46 -07:00
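The arithmetic behind b0d8823a8a can be checked with a short model. This is a sketch under stated assumptions, not the isel pattern itself: it models 64-bit registers with Python integers, and the function names (`mul_masked`, `slli`, `mulhu`, `mul_via_mulhu`) are hypothetical stand-ins for the DAG node and the RISC-V instructions named in the message.

```python
import random

MASK32 = 0xFFFFFFFF
MASK64 = (1 << 64) - 1

def mul_masked(x, y):
    """Reference: (mul (and X, 0xffffffff), (and Y, 0xffffffff)) in 64 bits."""
    return ((x & MASK32) * (y & MASK32)) & MASK64

def slli(x, shamt):
    """Model of SLLI on a 64-bit register: shift left, discard overflowed bits."""
    return (x << shamt) & MASK64

def mulhu(a, b):
    """Model of MULHU: upper 64 bits of the unsigned 128-bit product."""
    return ((a * b) >> 64) & MASK64

def mul_via_mulhu(x, y):
    """Two SLLIs move the low halves up; MULHU then yields the 32x32 product."""
    return mulhu(slli(x, 32), slli(y, 32))

if __name__ == "__main__":
    rng = random.Random(0)
    samples = [(rng.getrandbits(64), rng.getrandbits(64)) for _ in range(1000)]
    samples += [(0, 0), (MASK64, MASK64), (MASK32, MASK32), (1 << 32, 5)]
    print(all(mul_masked(x, y) == mul_via_mulhu(x, y) for x, y in samples))
```

Each shifted input has 32 zero bits at the bottom, so the 128-bit product is the 32x32 product shifted left by 64, and taking the upper half recovers it exactly.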
Andrew Litteken 5155dff278 [IRSim] Adding basic implementation of llvm-sim.
This is a similarity visualization tool that accepts a Module and
passes it to the IRSimilarityIdentifier.  The resulting SimilarityGroups
are output in a JSON file.

Tests are found in test/tools/llvm-sim and check for the file not found,
a bad module, and that the JSON is created correctly.

Reviewers: paquette, jroelofs, MaskRay

Recommit of: 15645d044b to fix linking
errors.

Differential Revision: https://reviews.llvm.org/D86974
2021-03-20 16:47:50 -05:00
Jinsong Ji 14696baaf4 [AIX] Update rpath for BUILD_SHARED_LIBS
BUILD_SHARED_LIBS builds LLVM components as shared libraries,
which can reduce the size a lot.

Normally, the binaries use ORIGIN../lib to load component libraries;
unfortunately, ORIGIN is not supported by AIX ld.

We hardcode the build lib and install lib paths in rpath for now
to enable the BUILD_SHARED_LIBS build.

We understand that this is not a perfect solution;
we can update this when we find a better one.

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D98901
2021-03-20 20:31:43 +00:00
Fangrui Song f628ba0b55 [test] Fix Driver/gcc-toolchain.cpp if CLANG_DEFAULT_RTLIB is compiler-rt 2021-03-20 13:24:49 -07:00
Sanjay Patel ee8b53815d [BranchProbability] move options for 'likely' and 'unlikely'
This makes the settings available for use in other passes by housing
them within the Support lib, but NFC otherwise.

See D98898 for the proposed usage in SimplifyCFG
(where this change was originally included).

Differential Revision: https://reviews.llvm.org/D98945
2021-03-20 14:46:46 -04:00
Jez Ng 47fdaa32f9 [lld-macho] Minor touch-up to objc.s 2021-03-20 14:42:16 -04:00
Stephen Kelly 188405bc19 [AST] Ensure that an empty json file is generated if compile errors
Differential Revision: https://reviews.llvm.org/D98827
2021-03-20 18:08:01 +00:00
Fangrui Song e92faa77b4 [test] Fix Driver/gcc-toolchain.cpp if CLANG_DEFAULT_CXX_STDLIB is libc++ 2021-03-20 11:06:44 -07:00
Fangrui Song 879760c245 [VE] Fix types of multiclass template arguments in TableGen files
These were not properly checked before `[TableGen] Improve handling of template arguments`.
2021-03-20 10:36:51 -07:00
Fangrui Song dc3b438c8f Revert "Revert "[Driver] Drop obsoleted Ubuntu 11.04 gcc detection""
This reverts commit 243333ef3e.
2021-03-20 09:57:05 -07:00
Vaivaswatha Nagaraj f860187ea6 [OCaml] Add (get/set)_module_identifer functions
Also:

- Fix a bug that crept in when fixing a buildbot failure in
f7be9db622
- Use mlsize_t for cstr_to_string as that is what
caml_alloc_string specifies.

Differential Revision: https://reviews.llvm.org/D98851
2021-03-20 20:41:51 +05:30
David Zarzycki 5cbe2279f7 [lit] Sort testing summary output
As fallout from the record-and-reorder work, people asked that the
summary output be sorted to aid diffing.
2021-03-20 07:52:08 -04:00
David Zarzycki 243333ef3e Revert "[Driver] Drop obsoleted Ubuntu 11.04 gcc detection"
This reverts commit bdf39e6b0e.

The change is failing on Fedora 33 (x86-64).
2021-03-20 07:29:01 -04:00
Nathan James 4dd92d61db [clang-tidy] Fix bugprone-terminating-continue when continue appears inside a switch
Don't emit a warning if the `continue` appears in a switch context as changing it to `break` will break out of the switch rather than a do loop containing the switch.
Fixes https://llvm.org/PR49492.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D98338
2021-03-20 10:59:37 +00:00