Commit Graph

386005 Commits

Author SHA1 Message Date
Louis Dionne 4cd6ca102a [libc++] NFC: Normalize `#endif //` comment indentation 2021-04-20 12:03:32 -04:00
Matt Arsenault 620fdb9671 GlobalISel: Defer register creation in handleAssignments
This is currently built on top of the SelectionDAG call lowering, but
does not use it the same way. SelectionDAG passes legalized types to
the assignment functions, and the tablegenerated assignment functions
may change the value types expected for registers. This does not
change the types used, just moves the register creation to help fix
this in the future.

Defer the register creation until after all of the assignment
decisions have been made. This will also help have correct tail call
compatibility checking in a future change. Currently it does not work
as expected for any arguments split across multiple registers.
2021-04-20 11:48:12 -04:00
Jay Foad ec8c61efdf [AMDGPU] Allow multiple uses of the same literal
In GFX10 VOP3 can have a literal, which opens up the possibility of two
operands using the same literal value, which is allowed and only counts
as one use of the constant bus.

AMDGPUAsmParser::validateConstantBusLimitations already knew about this
but SIInstrInfo::verifyInstruction did not.

Differential Revision: https://reviews.llvm.org/D100770
2021-04-20 16:44:01 +01:00
Ahmed Bougacha a0573b6c10 [AArch64] Bump apple-latest CPU alias to apple-a14. 2021-04-20 08:41:04 -07:00
Ahmed Bougacha cedb5b06df [AArch64] Don't always override CPU for arm64e.
This demotes the apple-a12 CPU selection for arm64e to just be the
last-resort default.  Concretely, this means:
- an explicitly-specified -mcpu will override the arm64e default;
  a user could potentially pick an invalid CPU that doesn't have
  v8.3a support, but that's not a major problem anymore
- arm64e-apple-macos (and variants) will pick apple-m1 instead of
  being forced to apple-a12.
2021-04-20 08:41:04 -07:00
Ahmed Bougacha a8a3a43792 [AArch64] Add apple-m1 CPU, and default to it for macOS.
apple-m1 has the same level of ISA support as apple-a14,
so this is a straightforward mechanical change.  However, that
also means this inherits apple-a14's v8.5a+nobti quirkiness.

rdar://68287159
2021-04-20 08:41:04 -07:00
LLVM GN Syncbot d51b22d782 [gn build] Port 120fa8293e 2021-04-20 15:33:43 +00:00
zoecarver 120fa8293e [libc++][nfc] Move iterator_traits and related into __iterator/iterator_traits.h.
Based on D100682 and D99855.

(Note: I originally was going to just make this part of D99855, but I decided not to because this patch moves lots of unrelated code around, and I didn't want to make D99855 harder to review because of unrelated code-changes/moves.)

Differential Revision: https://reviews.llvm.org/D100686
2021-04-20 08:31:34 -07:00
Matt Arsenault 14b03b4aad GlobalISel: Check for powers of 2 for inverse funnel shift lowering
This doesn't make a practical difference since it would only be broken
if a target actually had a legal non-power-of-2 inverse shift.
2021-04-20 11:30:22 -04:00
zoecarver 9f01ac3b32 [libcxx] makes `iterator_traits` C++20-aware
* adds `iterator_traits` specialisation that supports all expected
  member aliases except for `pointer`
* adds `iterator_traits` specialisations for iterators that meet the
  legacy iterator requirements but might lack multiple member aliases
* makes pointer `iterator_traits` specialisation require objects

Depends on D99854.

Differential Revision: https://reviews.llvm.org/D99855
2021-04-20 11:30:08 -04:00
Alexey Bataev b82344a019 Revert "[SLP] Add detection of shuffled/perfect matching of tree entries."
This reverts commit daf6e18c55 to fix the
compiler crash.
2021-04-20 08:29:32 -07:00
David Green 21a8b9d9e9 [ARM] Limit PerformExtractEltToVMOVRRD to when f64 is legal.
The generic SoftFloatVectorExtract.ll test was failing when run on arm
machines, as it tries to create a f64 under soft float. Limit the
transform to when f64 is legal.

Also add a missing override, as reported in D100244.
2021-04-20 16:24:36 +01:00
Matt Arsenault 1cb8a9d595 AMDGPU/GlobalISel: Fix uitofp/sitofp with non-power-of-2 integers 2021-04-20 11:13:29 -04:00
Erich Keane 0ed613612c Ensure target-multiversioning emits deferred declarations
As reported in PR50025, sometimes we would end up not emitting functions
needed by inline multiversioned variants. This is because we typically
use the 'deferred decl' mechanism to emit these.  However, the variants
are emitted after that typically happens.  This fixes that by ensuring
we re-run deferred decls after this happens. Also, the multiversion
emission is done recursively to ensure that MV functions that require
other MV functions to be emitted get emitted.
2021-04-20 08:10:26 -07:00
Matt Arsenault 83a25a1010 GlobalISel: Restrict narrow scalar for fptoui/fptosi results
This practically only works for the f16 case AMDGPU uses, not wider
types.

Fixes bug 49710 by failing legalization.
2021-04-20 10:54:40 -04:00
Matt Arsenault 8fbe04f46b MachineVerifier: Continue reporting errors for copies
This was skipping verification of later copies, but generally the
verifier tries to report as many things wrong as possible in the
function.
2021-04-20 10:54:40 -04:00
Alexey Bataev daf6e18c55 [SLP] Add detection of shuffled/perfect matching of tree entries.
SLP supports perfect diamond matching for the vectorized tree entries
but do not support it for gathered entries and does not support
non-perfect (shuffled) matching with 1 or 2 tree entries. Patch adds
support for this matching to improve cost of the vectorized tree.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D100495
2021-04-20 07:46:49 -07:00
Hanhan Wang 7b7df8e85e [mlir][StandardToSPIRV] Add support for lowering std.xor on bool to SPIR-V
std.xor ops on bool are lowered to spv.LogicalNotEqual. For Boolean values, xor
and not-equal are the same thing.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D100817
2021-04-20 07:35:20 -07:00
Nico Weber 476155e68e [gn build] reformat all gn files
$ git ls-files '*.gn' '*.gni' | xargs llvm/utils/gn/gn.py format

(and manually wrap two comments)
2021-04-20 10:34:08 -04:00
Bradley Smith b8b075d8d7 [AArch64][SVE] Lower MULHU/MULHS nodes to umulh/smulh instructions
Mark MULHS/MULHU nodes as legal for both scalable and fixed SVE types,
and lower them to the appropriate SVE instructions.

Additionally now that the MULH nodes are legal, integer divides can be
expanded into a more performant code sequence.

Differential Revision: https://reviews.llvm.org/D100487
2021-04-20 15:18:06 +01:00
Alexey Bataev cf00cb8bed Revert "[SLP] Add detection of shuffled/perfect matching of tree entries."
This reverts commit b232771aca to fix
buildbots.
2021-04-20 07:16:11 -07:00
David Green 48cef1fa8e [ARM] Create VMOVRRD from adjacent vector extracts
This adds a combine for extract(x, n); extract(x, n+1)  ->
VMOVRRD(extract x, n/2). This allows two vector lanes to be moved at the
same time in a single instruction, and thanks to the other VMOVRRD folds
we have added recently can help reduce the amount of executed
instructions. Floating point types are very similar, but will include a
bitcast to an integer type.

This also adds a shouldRewriteCopySrc, to prevent copy propagation from
DPR to SPR, which can break as not all DPR regs can be extracted from
directly.  Otherwise the machine verifier is unhappy.

Differential Revision: https://reviews.llvm.org/D100244
2021-04-20 15:15:43 +01:00
Andrzej Warzynski 6d0fef4860 [flang][driver] Refactor methods for parsing options (nfc)
This is just a small update that makes sure that errors arising from
parsing command-line options are captured more visibly. Also, all
parsing methods will now consistently return either a bool ("may fail")
or void ("never fails").

An instance of `InputKind` coming from `-x` is added to
`FrontendOptions` rather then being returned from `ParseFrontendArgs`.
It's currently not used, but we will require it shortly. In particular,
once code-generation is available we will use it to differentiate
between LLVM IR and Fortran input. `FrontendOptions` is a very suitable
place to keep it.

This changes don't affect the error reporting in the driver. In this
respect these are non-functional-changes. However, it will simplify
things in the forthcoming patches in which we may need a better error
tracking/recovery mechanism.

Differential Revision: https://reviews.llvm.org/D100556
2021-04-20 14:00:45 +00:00
Alexey Bataev b232771aca [SLP] Add detection of shuffled/perfect matching of tree entries.
SLP supports perfect diamond matching for the vectorized tree entries
but do not support it for gathered entries and does not support
non-perfect (shuffled) matching with 1 or 2 tree entries. Patch adds
support for this matching to improve cost of the vectorized tree.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D100495
2021-04-20 06:55:55 -07:00
Cullen Rhodes f166d0db71 [AArch64][AsmParser] NFC: Remove unused ExtendOp struct
Left over from 2625a993f9 when extend and shift were merged.
2021-04-20 13:45:09 +00:00
Thomas Preud'homme fd941036bf Fix PR46880: Fail CHECK-NOT with undefined variable
Currently a CHECK-NOT directive succeeds whenever the corresponding
match fails. However match can fail due to an error rather than a lack
of match, for instance if a variable is undefined. This commit makes match
error a failure for CHECK-NOT.

Reviewed By: jdenny

Differential Revision: https://reviews.llvm.org/D86222
2021-04-20 14:42:46 +01:00
Sebastian Neubauer 4897effb14 [AMDGPU] Add TransVALU to gfx10
Instructions on the transcendental unit are executed in parallel to the
normal VALU, so add this as an extra resource.

This doesn't seem to have any effect, but it should be more correct.

Differential Revision: https://reviews.llvm.org/D100123
2021-04-20 15:34:43 +02:00
Fraser Cormack 60622b82a7 [RISCV][NFC] Add tests for scalable-vector DAGCombiner improvements
These will all be improved by future patches.
2021-04-20 14:26:26 +01:00
Jay Foad 2aea830ec4 [AMDGPU] Use if instead of foreach in a few places. NFC. 2021-04-20 14:20:30 +01:00
Andrzej Warzynski c2e452fb05 [flang][nfc] Port 2 tests to use the new driver when enabled
This is similar to https://reviews.llvm.org/D100309, i.e. `%f18` is
replaced with `%flang_new`.

resolve105.f90 wasn't in tree when D100309 was worked on, so it's
updated here instead.

label14.f90 requires `-fsyntax-only`. I didn't notice that when
submitting D100309, hence updating it now instead. `-fsyntax-only` is
required to prevent `%f18` from calling an external compiler (which then
fails and returns a non-zero exit code).

Differential Revision: https://reviews.llvm.org/D100655
2021-04-20 12:49:47 +00:00
Louis Dionne 2704d0a701 [libc++][ci] Re-split the CI pipeline to try and reduce load on more builders 2021-04-20 08:37:52 -04:00
Andrea Di Biagio 2226d21896 [MCA][LSUnit] Fix a potential use after free in the logic that updates memory groups.
Make sure that the `CriticalMemoryInstruction` of a memory group is invalidated
if it references an already executed instruction.  This avoids a potential
use-after-free if the critical memory info becomes stale, and the value is
read after the instruction has executed.
2021-04-20 13:30:45 +01:00
Nemanja Ivanovic 03e7fefff8 [PowerPC] Canonicalize shuffles on big endian targets as well
Extend shuffle canonicalization and conversion of shuffles fed by vectorized
scalars to big endian subtargets. For big endian subtargets, loads and direct
moves of scalars into vector registers put the data in the correct element for
SCALAR_TO_VECTOR if the data type is 8 bytes wide. However, if the data type is
narrower, the value still ends up in the wrong place - althouth a different
wrong place than on little endian targets.

This patch extends the combine that keeps values where they are if they feed a
shuffle to big endian targets.

Differential revision: https://reviews.llvm.org/D100478
2021-04-20 07:29:47 -05:00
Nico Weber 1a3f88658a [llvm-objdump] Add an llvm-otool tool
This implements an LLVM tool that's flag- and output-compatible
with macOS's `otool` -- except for bugs, but from testing with both
`otool` and `xcrun otool-classic`, llvm-otool matches vanilla
otool's behavior very well already. It's not 100% perfect, but
it's a very solid start.

This uses the same approach as llvm-objcopy: llvm-objdump uses
a different OptTable when it's invoked as llvm-otool. This
is possible thanks to D100433.

Differential Revision: https://reviews.llvm.org/D100583
2021-04-20 08:24:58 -04:00
Cullen Rhodes 8a6772f3aa [ValueTypes] Fix sizes of v256i32 and v256f32 (8182 -> 8192) 2021-04-20 12:10:02 +00:00
Jay Foad edea476142 [AMDGPU] Use simpler alternatives to !foldl. NFC. 2021-04-20 12:59:04 +01:00
Tobias Gysi b9715156ff [mlir][linalg] lower index operations during linalg to vector lowering.
The patch extends the vectorization pass to lower linalg index operations to vector code. It allocates constant 1d vectors that enumerate the indexes along the iteration dimensions and broadcasts/transposes these 1d vectors to the iteration space.

Differential Revision: https://reviews.llvm.org/D100373
2021-04-20 11:55:44 +00:00
Simon Pilgrim e156f2515c [DAG] SelectionDAG.cpp - breakup if-else chains where each block returns. NFCI.
Match style guide that requests that if+return blocks are separate.
2021-04-20 12:37:00 +01:00
Simon Pilgrim fce8c10b68 Fix Wdocumentation warning by consistently using '///' comment blocks. NFCI. 2021-04-20 12:37:00 +01:00
Tobias Gysi 856b24df08 [mlir] test gather/scatter index vector of type index.
Test the vector to llvm lowering of index vectors with index element type.

Differential Revision: https://reviews.llvm.org/D100827
2021-04-20 11:24:04 +00:00
Thomas Preud'homme d618c6e8ce [lit, test] Fix test cancellation feature detection
A lit feature guards tests for the lit timeout functionality because on
most system it depends on the availability of the psutil Python module.
However, that feature is defined based on the ability of the testing lit
to cancel test, which does not necessarily apply to the ability of the
tested lit.

In particular, RUN commands have a cleared PYTHONPATH and user site
packages are disabled. In the case where psutil is found by the testing
lit from one of those two source of python path, the tested lit would
not be able to find it, causing timeout tests to fail.

This commit fixes the issue by testing the ability to cancel tests in
the RUN command environment.

Reviewed By: yln

Differential Revision: https://reviews.llvm.org/D99728
2021-04-20 12:09:30 +01:00
Martin Probst 3d4a6037ff clang-format: [JS] do not merge imports and exports.
Previously, clang-format would erroneously merge import and export
statements. These need to be kept separate, as the semantics differ.

Differential Revision: https://reviews.llvm.org/D100752
2021-04-20 13:08:18 +02:00
Thomas Preud'homme 8cee150e9a [C++, test] Fix typo in NSS* vars
The NSS FileCheck variables at the end of the
CodeGenCXX/split-stacks.cpp clang testcase are off by 1, resulting in
the use of an undefined variable (NSS3). One of the CHECK-NOT is also
redundant because _Z8tnosplitIiEiv uses the same attribute as _Z3foov
without split stack. This commit fixes that.

Reviewed By: ChuanqiXu

Differential Revision: https://reviews.llvm.org/D99839
2021-04-20 12:07:41 +01:00
hsmahesha 840c4e4e90 [AMDGPU] Re-arrange ds_read/ds_write ISel pattern for better readability.
Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D100773
2021-04-20 16:17:15 +05:30
Dávid Bolvanský 319c9f6e58 [MemoryBuiltins] Added support for memalign
memalign is older aligned_alloc.
2021-04-20 12:39:54 +02:00
Simon Pilgrim 5ed8cea9a8 [Support] APInt.h - remove <algorithm> include. NFCI.
Replace std::min use which should allow us to avoid including the <algorithm> header in every include of APInt.h.
2021-04-20 11:21:39 +01:00
Simon Pilgrim 1c6df71a9b [CodeGen] CodeGenPassBuilder.h - remove unnecessary <string> include. NFCI.
We only use StringRef so include that.
2021-04-20 11:21:39 +01:00
Ben Shi 30e2c7be99 [RISCV] Refactor an optimization of addition with immediate
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D100769
2021-04-20 18:04:25 +08:00
Joe Ellis effacc1599 [AArch64] Constant fold sve_convert_from_svbool(zero) to zero
Co-authored-by: Paul Walker <paul.walker@arm.com>

Differential Revision: https://reviews.llvm.org/D100463
2021-04-20 10:02:49 +00:00
Joe Ellis c91cd4f3bb [AArch64][SVE][InstCombine] Replace last{a,b} intrinsics with extracts...
when the predicate used by last{a,b} specifies a known vector length.

For example:
  aarch64_sve_lasta(VL1, D) -> extractelement(D, #1)
  aarch64_sve_lastb(VL1, D) -> extractelement(D, #0)

Co-authored-by: Paul Walker <paul.walker@arm.com>

Differential Revision: https://reviews.llvm.org/D100476
2021-04-20 10:01:33 +00:00