Commit Graph

61268 Commits

Author SHA1 Message Date
Philip Reames 03a979a45a [Tests] Rename tests before adding new ones
llvm-svn: 360092
2019-05-06 22:16:55 +00:00
Philip Reames 4bcf10fc0f [Tests] Autogen a test in advance of updates
llvm-svn: 360091
2019-05-06 22:12:07 +00:00
Philip Reames 2f53d79bff Fix pr33010, a 2 year old crashing regression
The problem was that we were creating a CMOV64rr <TargetFrameIndex>, <TargetFrameIndex>.  The entire point of a TFI is that address code is not generated, so there's no way to legalize/lower this.  Instead, simply prevent its creation.

Arguably, we shouldn't be using *Target*FrameIndices in StatepointLowering at all, but that's a much deeper change.  

llvm-svn: 360090
2019-05-06 22:09:31 +00:00
Craig Topper 77e69d8850 [X86] Add more test cases for fast-isel handling of fneg.
The fneg double case is falling back to a subsd in 32-bit mode if you write a test that doesn't trigger a fast-isel abort on the return value.

The subsd lowering has different behavior with respect to NaNs than using an xor. This is inconsistent with what we would do in SelectionDAG
and can lead to differences between -O0 and -O2.

llvm-svn: 360088
2019-05-06 22:04:26 +00:00
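As a rough illustration of the r360088 message above, the sketch below contrasts the two ways of negating a double: flipping the sign bit (the xor-style lowering) and computing -0.0 - x (the subsd-style lowering). The function names are hypothetical and this is not LLVM code. The two agree on ordinary values and zeros; for a NaN input only the xor form is guaranteed to flip the sign bit, while the sign of the subtraction result depends on the hardware, so it is not asserted here.

```python
import math
import struct


def sign_bit(x: float) -> int:
    """Return the IEEE-754 sign bit (0 or 1) of a double."""
    return struct.unpack("<Q", struct.pack("<d", x))[0] >> 63


def fneg_xor(x: float) -> float:
    """Negate by flipping only the sign bit (xor-style lowering)."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    return struct.unpack("<d", struct.pack("<Q", bits ^ (1 << 63)))[0]


def fneg_sub(x: float) -> float:
    """Negate by computing -0.0 - x (subsd-style lowering)."""
    return -0.0 - x


nan = float("nan")
# The xor form always flips the sign bit, even for a NaN.
print(sign_bit(nan), sign_bit(fneg_xor(nan)))
# The subtraction form still yields a NaN, but its sign is not guaranteed.
print(math.isnan(fneg_sub(nan)))
```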
Stanislav Mekhanoshin 1bc001dec4 [AMDGPU] gfx1010 memory legalizer
Differential Revision: https://reviews.llvm.org/D61535

llvm-svn: 360087
2019-05-06 21:57:02 +00:00
Jordan Rupprecht 8f14e7cacf Revert "Re-commit r357452: SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259)"
This reverts r357452 (git commit 21eb771dcb).

This was causing strange optimization-related test failures on an internal test. Will followup with more details offline.

llvm-svn: 360086
2019-05-06 21:55:05 +00:00
Craig Topper d10a200ceb [X86] Remove the suffix on vcvt[u]si2ss/sd register variants in assembly printing.
We require d/q suffixes on the memory form of these instructions to disambiguate the memory size.
We don't require it on the register forms, but need to support parsing both with and without it.

Previously we always printed the d/q suffix on the register forms, but it's redundant and
inconsistent with gcc and objdump.

After this patch we should support the d/q for parsing, but not print it when it's unneeded.

llvm-svn: 360085
2019-05-06 21:39:51 +00:00
Martin Storsjo 899f3cd581 [AArch64] Default to SEH exception handling on MinGW
The SEH implementation is pretty mature at this point.

Differential Revision: https://reviews.llvm.org/D61590

llvm-svn: 360080
2019-05-06 21:18:15 +00:00
Sanjay Patel a6019d5164 [InstCombine] sink FP negation of operands through select
We don't always get this:

Cond ? -X : -Y --> -(Cond ? X : Y)

...even with the legacy IR form of fneg in the case with extra uses,
and we miss matching with the newer 'fneg' instruction because we
are expecting binops through the rest of the path.

Differential Revision: https://reviews.llvm.org/D61604

llvm-svn: 360075
2019-05-06 20:34:05 +00:00
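The fold named in r360075 above, Cond ? -X : -Y --> -(Cond ? X : Y), can be checked on scalar values with a small sketch. The helper names are hypothetical and this only models the arithmetic identity, not the InstCombine implementation; signed zeros are included because negation must preserve them.

```python
import math


def select_of_negs(cond: bool, x: float, y: float) -> float:
    """Cond ? -X : -Y (negate both operands, then select)."""
    return -x if cond else -y


def neg_of_select(cond: bool, x: float, y: float) -> float:
    """-(Cond ? X : Y) (select first, then negate once)."""
    return -(x if cond else y)


def same_value_and_sign(a: float, b: float) -> bool:
    """Compare two floats including the sign of zero."""
    return a == b and math.copysign(1.0, a) == math.copysign(1.0, b)


samples = [0.0, -0.0, 1.5, -2.25, math.inf]
agree = all(
    same_value_and_sign(select_of_negs(c, x, y), neg_of_select(c, x, y))
    for c in (True, False)
    for x in samples
    for y in samples
)
print(agree)
```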
Craig Topper ad56843dd7 [SelectionDAG][X86] Support inline assembly returning an mmx register into a type with fewer than 64 bits.
It's possible to use the 'y' mmx constraint with a type narrower than 64-bits.

This patch supports this by bitcasting the mmx type to 64-bits and then
truncating to the desired type.

There are probably other missing type combinations we need to support, but this
is the case we have a bug report for.

Fixes PR41748.

Differential Revision: https://reviews.llvm.org/D61582

llvm-svn: 360069
2019-05-06 19:50:14 +00:00
Amara Emerson 3d1128cc9e [GlobalISel] Handle <1 x T> vector return types properly.
After support for dealing with types that need to be extended in some way was
added in r358032 we didn't correctly handle <1 x T> return types. These types
don't have a GISel direct representation, instead we just see them as scalars.
When we need to pad them into <2 x T> types however we need to use a
G_BUILD_VECTOR instead of trying to do a G_CONCAT_VECTOR.

This fixes PR41738.

llvm-svn: 360068
2019-05-06 19:41:01 +00:00
Craig Topper 55a71b575c Revert r359392 and r358887
Reverts "[X86] Remove (V)MOV64toSDrr/m and (V)MOVDI2SSrr/m. Use 128-bit result MOVD/MOVQ and COPY_TO_REGCLASS instead"
Reverts "[TargetLowering][AMDGPU][X86] Improve SimplifyDemandedBits bitcast handling"

Eric Christopher and Jorge Gorbe Moya reported some issues with these patches to me off list.

Removing the CodeGenOnly instructions has changed how fneg is handled during fast-isel with sse/sse2. We're now emitting fsub -0.0, x instead of
moving to the integer domain (in a GPR), xoring the sign bit, and then moving back to xmm. This is because the fast isel table no longer
contains an entry for (f32/f64 bitcast (i32/i64)) so the target independent fneg code fails. The use of fsub changes the behavior of nan with
respect to -O2 codegen which will always use a pxor. NOTE: We still have a difference with double with -m32 since the move to GPR doesn't work
there. I'll file a separate PR for that and add test cases.

Since removing the CodeGenOnly instructions was fixing PR41619, I'm reverting r358887 which exposed that PR. Though I wouldn't be surprised
if that bug can still be hit independent of that.

This should hopefully get Google back to green. I'll work with Simon and other X86 folks to figure out how to move forward again.

llvm-svn: 360066
2019-05-06 19:29:24 +00:00
Paul Robinson 1e18bfe892 Fix more Windows bots after r360015.
Depending on the environment, the directory separator might
appear as \ or \\ on different bots.

http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/17446/steps/test-check-all/logs/stdio

llvm-svn: 360065
2019-05-06 19:12:25 +00:00
Sanjay Patel 473dbf0301 [InstCombine] add tests for fneg+sel; NFC
llvm-svn: 360058
2019-05-06 17:29:22 +00:00
Nikita Popov cfe786a195 [SDAG][AArch64] Boolean and/or reduce to umax/min reduce (PR41635)
This addresses one half of https://bugs.llvm.org/show_bug.cgi?id=41635
by combining a VECREDUCE_AND/OR into VECREDUCE_UMIN/UMAX (if latter is
legal but former is not) for zero-or-all-ones boolean reductions (which
are detected based on sign bits).

Differential Revision: https://reviews.llvm.org/D61398

llvm-svn: 360054
2019-05-06 16:17:17 +00:00
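The reasoning behind r360054 above can be sketched on plain integers: when every lane is either zero or all ones, an AND reduction equals the unsigned minimum and an OR reduction equals the unsigned maximum. The 32-bit lane width and the helper names below are assumptions made for illustration only.

```python
from functools import reduce
from itertools import product

ALL_ONES = 0xFFFFFFFF  # assumed 32-bit "true" lane value


def reduce_and(lanes):
    """Bitwise AND across all lanes."""
    return reduce(lambda a, b: a & b, lanes)


def reduce_or(lanes):
    """Bitwise OR across all lanes."""
    return reduce(lambda a, b: a | b, lanes)


def check_all(num_lanes: int) -> bool:
    """Exhaustively compare AND/OR with unsigned min/max on boolean lanes."""
    for lanes in product((0, ALL_ONES), repeat=num_lanes):
        if reduce_and(lanes) != min(lanes):
            return False
        if reduce_or(lanes) != max(lanes):
            return False
    return True


print(check_all(4))
```

The equivalence does not hold for arbitrary lane values, which is why the commit restricts the combine to zero-or-all-ones booleans.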
Cameron McInally c3167696bc Add FNeg support to InstructionSimplify
Differential Revision: https://reviews.llvm.org/D61573

llvm-svn: 360053
2019-05-06 16:05:10 +00:00
Sanjay Patel 3379fb599d [InstCombine] regenerate test checks; NFC
llvm-svn: 360052
2019-05-06 16:03:53 +00:00
Nemanja Ivanovic 70afe4f7e1 [PowerPC] Fix erroneous condition for converting uint-to-fp vector conversion
A condition for exiting the legalization of v4i32 conversion to v2f64 through
extract/convert/build erroneously checks for the extract having type i32.
This is not adequate as smaller extracts are actually legalized to i32 as well.
Furthermore, an early exit is missing which means that we only check that
both extracts are from the same vector if that check fails.
As a result, both cases in the included test case fail - the first gets a
select error and the second generates incorrect code.

The culprit commit is r274535.

llvm-svn: 360043
2019-05-06 13:35:49 +00:00
Fangrui Song abb066c3f9 [test] Remove redundant bracket in rL360035
llvm-svn: 360036
2019-05-06 11:43:19 +00:00
Fangrui Song a79ec7b0b2 Try fix Windows bot after rL360015
llvm-svn: 360035
2019-05-06 11:39:49 +00:00
Fangrui Song 39a0a99330 Try fix Windows bot after rL360015
http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/25599/steps/test/logs/stdio

llvm-svn: 360034
2019-05-06 11:37:20 +00:00
Fangrui Song 4c3d579096 [CodeGen] Move X86 tests under the X86 directory
llvm-svn: 360029
2019-05-06 10:21:17 +00:00
Guillaume Chatelet 3cfb48b877 [NFC] Update memcpy tests
Summary: Runs utils/update_llc_test_checks.py on a few memcpy files

Reviewers: courbet

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61507

Remove cfi noise by adding nounwind

llvm-svn: 360023
2019-05-06 09:46:50 +00:00
Fangrui Song 041c377a59 [X86] Move files to correct directories after D60552
llvm-svn: 360022
2019-05-06 09:24:36 +00:00
Clement Courbet 9e1f2a7fe7 [SimplifyLibCalls] Simplify bcmp too.
Summary: Fixes PR40699.

Reviewers: gchatelet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61585

llvm-svn: 360021
2019-05-06 09:15:22 +00:00
Pengfei Wang b5d3430d3d [NFC] This is a test for the commit access.
Summary: Signed-off-by: Pengfei Wang <pengfei.wang@intel.com>

Reviewers: LuoYuanke

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61580

llvm-svn: 360019
2019-05-06 08:31:18 +00:00
Fangrui Song 7e55672b22 DWARF v5: fix directory index in the line table
Summary:
Prior to DWARF v5, a directory index of 0 represents DW_AT_comp_dir.

In DWARF v5, the index starts with 0 and Entry.DirIdx is the index into
Prologue.IncludeDirectories.

Reviewed By: labath

Differential Revision: https://reviews.llvm.org/D61253

llvm-svn: 360015
2019-05-06 08:03:46 +00:00
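The indexing rule described in r360015 above can be sketched as follows. The function and parameter names are hypothetical and do not correspond to the LLVM API; the sketch assumes the pre-v5 include directory table is 1-based, with index 0 reserved for the compilation directory.

```python
def resolve_directory(version, dir_index, comp_dir, include_directories):
    """Map a line-table directory index to a directory path."""
    if version >= 5:
        # DWARF v5: the index is used directly, starting at 0.
        return include_directories[dir_index]
    if dir_index == 0:
        # Before v5, index 0 stands for DW_AT_comp_dir.
        return comp_dir
    # Before v5, index 1 refers to the first table entry.
    return include_directories[dir_index - 1]


print(resolve_directory(4, 0, "/src", ["/inc"]))
print(resolve_directory(5, 1, "/src", ["/src", "/inc"]))
```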
Markus Lavin a778074165 [DebugInfo] GlobalOpt DW_OP_deref_size instead of DW_OP_deref.
Optimization pass lib/Transforms/IPO/GlobalOpt.cpp needs to insert
DW_OP_deref_size instead of DW_OP_deref to be compatible with big-endian
targets for same reasons as in D59687.

Differential Revision: https://reviews.llvm.org/D60611

llvm-svn: 360013
2019-05-06 07:20:56 +00:00
Cameron McInally 1c34db85e5 Precommit an FNeg InstructionSimplify test.
llvm-svn: 359990
2019-05-05 18:22:09 +00:00
Cameron McInally 1d0c845d9d Add FNeg IR constant folding support
llvm-svn: 359982
2019-05-05 16:07:09 +00:00
Cameron McInally fd254e429e Add InstCombine tests for FNeg instruction.
llvm-svn: 359970
2019-05-04 14:56:08 +00:00
Sanjay Patel 5ab41a7a05 [CodeGenPrepare] limit overflow intrinsic matching to a single basic block (2nd try)
This is a subset of the original commit from rL359879
which was reverted because it could crash when using the 'RemovedInstructions'
structure that enables delayed deletion of dead instructions. The motivating
compile-time win does not require that change though. We should get most of
that win from this change alone.

Using/updating a dominator tree to match math overflow patterns may be very
expensive in compile-time (because of the way CGP uses a DT), so just handle
the single-block case.

See post-commit thread for rL354298 for more details:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190422/646276.html

Differential Revision: https://reviews.llvm.org/D61075

llvm-svn: 359969
2019-05-04 12:46:32 +00:00
Fangrui Song 08b28ce2f2 [llvm-nm] Convert weak.test to use yaml2obj and fix untested 'v'
This restores part of the good change reverted by r359830.

llvm-svn: 359965
2019-05-04 09:12:18 +00:00
Stanislav Mekhanoshin 51d1415a16 [AMDGPU] gfx1010 hazard recognizer
Differential Revision: https://reviews.llvm.org/D61536

llvm-svn: 359961
2019-05-04 04:30:57 +00:00
Stanislav Mekhanoshin 28a1936f6d [AMDGPU] gfx1010: use fmac instructions
Differential Revision: https://reviews.llvm.org/D61527

llvm-svn: 359959
2019-05-04 04:20:37 +00:00
Sanjay Patel dd2e91a181 [x86] add tests for fneg IR with undef; NFC
llvm-svn: 359941
2019-05-03 22:47:29 +00:00
Jessica Paquette 910630c1e4 [AArch64][GlobalISel] Use fcsel instead of csel for G_SELECT on FPRs
This saves us some unnecessary copies.

If the inputs to a G_SELECT are floating point, we should use fcsel rather than
csel.

Changes here are...

- Teach selectCopy about s1-to-s1 copies across register banks.
- Teach AArch64RegisterBankInfo about G_SELECT in general.
- Teach the instruction selector about the FCSEL instructions.

Also add two tests:

- select-select.mir to show that we get the expected FCSEL
- regbank-select.mir (unfortunately named) to show the register banks on
G_SELECT are properly preserved

And update fast-isel-select.ll to show that we do the same thing as other
instruction selectors in these cases.

llvm-svn: 359940
2019-05-03 22:37:46 +00:00
Stanislav Mekhanoshin d9dcf392c7 [AMDGPU] gfx1010 wait count insertion
Differential Revision: https://reviews.llvm.org/D61534

llvm-svn: 359938
2019-05-03 21:53:53 +00:00
Stanislav Mekhanoshin 41bbe101a2 [AMDGPU] gfx1010 s_code_end generation
Also add some missing metadata in the streamer.

Differential Revision: https://reviews.llvm.org/D61531

llvm-svn: 359937
2019-05-03 21:26:39 +00:00
Mandeep Singh Grang 5dc8aeb26d [COFF, ARM64] Fix ABI implementation of struct returns
Summary:
Refer to the ABI doc at: https://docs.microsoft.com/en-us/cpp/build/arm64-windows-abi-conventions?view=vs-2019#return-values

Related clang patch: D60349

Reviewers: rnk, efriedma, TomTan, ssijaric

Reviewed By: rnk, efriedma

Subscribers: mstorsjo, javed.absar, kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60348

llvm-svn: 359934
2019-05-03 21:12:36 +00:00
Matt Arsenault b6c599afd3 Reapply r359906, "RegAllocFast: Add heuristic to detect values not live-out of a block"
This reverts commit r359912.

This should pass now, since the clang test was made less fragile in
r359918.

llvm-svn: 359919
2019-05-03 19:06:57 +00:00
Don Hinton f6eac2dd3b [CommandLine] Enable Grouping for short options by default. Part 4 of 5
Summary:
This change enables `cl::Grouping` for short options --
options with names of a single character.  This is consistent with GNU
getopt behavior.

Reviewers: rnk, MaskRay

Reviewed By: MaskRay

Subscribers: thopre, cfe-commits, MaskRay, rupprecht, hiraditya, llvm-commits

Tags: #llvm, #clang

Differential Revision: https://reviews.llvm.org/D61270

llvm-svn: 359917
2019-05-03 18:56:25 +00:00
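The GNU getopt behavior that r359917 above refers to can be seen with Python's standard getopt module, used here purely as a stand-in; it is not the LLVM CommandLine library. Single-character options written together after one dash are parsed as separate options. The option letters and file name are made up for the example.

```python
import getopt


def parse_short(argv):
    """Parse the flag options -a, -b and -c, allowing them to be grouped."""
    opts, rest = getopt.getopt(argv, "abc")
    return [name for name, _ in opts], rest


# "-abc" is treated the same as "-a -b -c".
print(parse_short(["-abc", "file.txt"]))
print(parse_short(["-a", "-b", "-c", "file.txt"]))
```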
Nico Weber bb852a9672 Revert r359906, "RegAllocFast: Add heuristic to detect values not live-out of a block"
Makes clang/test/Misc/backend-stack-frame-diagnostics-fallback.cpp fail.

llvm-svn: 359912
2019-05-03 18:08:03 +00:00
Don Hinton c242be40a1 [CommandLine] Change help output to prefix long options with `--` instead of `-`. NFC . Part 3 of 5
Summary:
By default, `parseCommandLineOptions()` will accept either a
`-` or `--` prefix for long options -- options with names longer than
a single character.

While this change does not affect behavior, it will be helpful with a
subsequent change that requires long options use the `--` prefix.

Reviewers: rnk, thopre

Reviewed By: thopre

Subscribers: thopre, cfe-commits, hiraditya, llvm-commits

Tags: #llvm, #clang

Differential Revision: https://reviews.llvm.org/D61269

llvm-svn: 359909
2019-05-03 17:47:29 +00:00
Evgeniy Stepanov 46ec57e576 Revert "[CodeGenPrepare] limit overflow intrinsic matching to a single basic block"
This reverts commit r359879, which introduced a compiler crash.

llvm-svn: 359908
2019-05-03 17:31:49 +00:00
Matt Arsenault daf2d653fa RegAllocFast: Add heuristic to detect values not live-out of a block
Add an improved/new heuristic to catch more cases when values are not
live out of a basic block.

Patch by Matthias Braun

llvm-svn: 359906
2019-05-03 17:03:24 +00:00
Brian Cain 3428c9daef [hexagon] change AsmParser assertion to error
For immediates that can't be evaluated in assembler-mapped instructions, we
should return 'invalid operand' instead of assert.

llvm-svn: 359905
2019-05-03 16:50:38 +00:00
Craig Topper a8f3840c62 [X86] Allow assembly parser to accept x/y/z suffixes on non-memory vfpclassps/pd and on memory forms in intel syntax
The x/y/z suffix is needed to disambiguate the memory form in at&t syntax since no xmm/ymm/zmm register is mentioned.

But we should also allow it for the register and broadcast forms where it's not needed for consistency. This matches gas.

The printing code will still only use the suffix for the memory form where it is needed.

llvm-svn: 359903
2019-05-03 16:15:15 +00:00
Matt Arsenault 657ef48a88 AMDGPU: Select VOP3 form of sub
The VOP3 form should always be the preferred selection form to be
shrunk later.

The r600 sub test needs to be split out because it asserts on the
arguments in the new test during the calling convention lowering.

llvm-svn: 359899
2019-05-03 15:37:07 +00:00
Matt Arsenault cfd0ca38b0 AMDGPU: Support shrinking add with FI in SIFoldOperands
Avoids test regression in a future patch

llvm-svn: 359898
2019-05-03 15:21:53 +00:00
Robert Lougher e28ab93546 Revert r359549 - incorrect update of test checks. NFC
llvm-svn: 359897
2019-05-03 15:14:19 +00:00
Sanjay Patel d0336b1e3f [x86] add tests for fneg with undefs; NFC
This was originally part of D61419.

llvm-svn: 359896
2019-05-03 15:09:53 +00:00
Matt Arsenault ca7a582bf3 AMDGPU: Add baseline test for future patch
llvm-svn: 359893
2019-05-03 14:54:38 +00:00
Matt Arsenault 0446fbe45e AMDGPU: Replace shrunk instruction with dummy implicit_def
This was broken if the original operand was killed. The kill flag
would appear on both instructions, and fail the verifier. Keep the
kill flag, but remove the operands from the old instruction. This has
an added benefit of really reducing the use count for future folds.

Ideally the pass would be structured more like what PeepholeOptimizer
does to avoid this hack to avoid breaking instruction iterators.

llvm-svn: 359891
2019-05-03 14:40:10 +00:00
Sid Manning 5ad18a7d59 Let --discard-all imply --strip-debug.
This will match gnu strip's behavior.

Differential Revision: https://reviews.llvm.org/D61092

llvm-svn: 359887
2019-05-03 14:14:01 +00:00
Simon Pilgrim 4d4f779fa2 [X86] Add X64 common prefixes and regenerate mul i64 tests
Noticed while reviewing D61472

llvm-svn: 359886
2019-05-03 14:07:38 +00:00
Matt Arsenault 6d0c59605c AMDGPU: Forgot to commit test file for r358890
llvm-svn: 359885
2019-05-03 13:55:40 +00:00
Matt Arsenault 2c8936fd26 AMDGPU: Fix incorrect commute with sub when folding immediates
When a fold of an immediate into a sub/subrev required shrinking the
instruction, the wrong VOP2 opcode was used. This was using the VOP2
equivalent of the original instruction, not the commuted instruction
with the inverted opcode.

llvm-svn: 359883
2019-05-03 13:42:56 +00:00
Matt Arsenault 2636460f0e AMDGPU: Fix test verification
This should run the verifier, and needs to enable trackRegLiveness.

llvm-svn: 359882
2019-05-03 13:42:55 +00:00
Sanjay Patel d3cfaae243 [LICM] auto-generate complete test checks; NFC
llvm-svn: 359881
2019-05-03 13:25:06 +00:00
Sanjay Patel 8ff072e48e [CodeGenPrepare] limit overflow intrinsic matching to a single basic block
Using/updating a dominator tree to match math overflow patterns may be very
expensive in compile-time (because of the way CGP uses a DT), so just handle
the single-block case.

Also, we were restarting the iterator loops when doing the overflow intrinsic
transforms by marking the dominator tree for update. That was done to prevent
iterating over a removed instruction. But we can postpone the deletion using
the existing "RemovedInsts" structure, and that means we don't need to update
the DT.

See post-commit thread for rL354298 for more details:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190422/646276.html

Differential Revision: https://reviews.llvm.org/D61075

llvm-svn: 359879
2019-05-03 13:09:18 +00:00
Sean Fertile fd75ee9154 [Object][XCOFF] Add an XCOFF dumper for llvm-readobj.
Patch adds support for dumping of file headers with llvm-readobj. XCOFF
object files are added to test dumping a well formed file, and dumping
both negative timestamps and negative symbol counts, both of which are
allowed in the XCOFF definition.

Differential Revision: https://reviews.llvm.org/D60878

llvm-svn: 359878
2019-05-03 12:57:07 +00:00
Anton Afanasyev 6d08b8dbae Revert "[MIR] Add simple PRE pass to MachineCSE"
This reverts commit 9c20156de3.
It breaks stage 2 of clang-ppc64be-linux-multistage.

llvm-svn: 359875
2019-05-03 12:36:22 +00:00
Anton Afanasyev 9c20156de3 [MIR] Add simple PRE pass to MachineCSE
This is the second part of the commit fixing PR38917 (hoisting
partially redundant machine instructions). Most of the PRE (partial
redundancy elimination) and CSE work is done on LLVM IR, but some
redundancy arises during DAG legalization. Machine CSE is not enough
to deal with it. This simple PRE implementation works a little bit
intricately: it runs before CSE, looking for partial redundancy
and transforming it into full redundancy, anticipating that the next
CSE step will eliminate this created redundancy. If CSE doesn't
eliminate it, then the created instruction will remain dead and be eliminated
later by the Remove Dead Machine Instructions pass.

The third part of the commit is supposed to refactor MachineCSE,
to make it more clear and to merge MachinePRE with MachineCSE,
so one need not rely on a further Remove Dead pass to clear instrs
not eliminated by CSE.

First step: https://reviews.llvm.org/D54839

Fixes llvm.org/PR38917

Reviewers: RKSimon

Subscribers: hfinkel, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D56772

llvm-svn: 359870
2019-05-03 10:30:59 +00:00
Quentin Colombet c9256cc6ba [IRTranslator] Use the alloc size instead of the store size when translating allocas
We used to incorrectly use the store size instead of the alloc size when
creating the stack slot for allocas.
On aarch64 this can be demonstrated by allocating weirdly sized types.

For instance, in the added test case, we use an alloca for i19. We used
to allocate a slot of size 24-bit (19 rounded up to the next byte),
whereas we really want to use a full 32-bit slot for this type.

llvm-svn: 359856
2019-05-03 01:23:56 +00:00
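The i19 arithmetic in r359856 above can be reproduced with a small sketch. The helper names are hypothetical, and the 32-bit ABI alignment for i19 is an assumption chosen to match the 32-bit slot the commit message mentions; the real values come from the target's data layout.

```python
def store_size_bits(width_bits: int) -> int:
    """Round the type width up to a whole number of bytes."""
    return ((width_bits + 7) // 8) * 8


def alloc_size_bits(width_bits: int, abi_align_bits: int) -> int:
    """Round the store size up to a multiple of the ABI alignment."""
    store = store_size_bits(width_bits)
    return ((store + abi_align_bits - 1) // abi_align_bits) * abi_align_bits


print(store_size_bits(19))      # size used before the fix
print(alloc_size_bits(19, 32))  # size wanted for the stack slot
```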
Eli Friedman 7238353848 [AArch64][MC] Reject "add x0, x1, w2, lsl #1" etc.
Looks like just a minor oversight in the parsing code.

Fixes https://bugs.llvm.org/show_bug.cgi?id=41504.

Differential Revision: https://reviews.llvm.org/D60840

llvm-svn: 359855
2019-05-03 00:59:52 +00:00
Eli Friedman 0b61d220c9 [AArch64][Windows] Compute function length correctly in unwind tables.
The primary fix here is to WinException.cpp: we need to exclude jump
tables when computing the length of a function, or else we fail to
correctly compute the length. (We can only compute the number of bytes
consumed by certain assembler directives after the entire file is
parsed. ".p2align" is one of those directives, and is used by jump table
generation.)

The secondary fix, to MCWin64EH, is to make sure we don't silently
miscompile if we hit a similar situation in the future.

It's possible we could extend ARM64EmitUnwindInfo so it allows function
bodies that contain assembler directives, but that's a lot more
complicated; see the FIXME in MCWin64EH.cpp.

Fixes https://bugs.llvm.org/show_bug.cgi?id=41581 .

Differential Revision: https://reviews.llvm.org/D61095

llvm-svn: 359849
2019-05-03 00:10:45 +00:00
Alina Sbirlea 0363c3b8bb [MemorySSA] Check that block is reachable when adding phis.
Summary:
Originally the insertDef method was only used when building MemorySSA, and was limiting the number of Phi nodes that it created.
Now it's used for updates as well, and it can create additional Phis needed for correctness.
Make sure no Phis are created in unreachable blocks (condition met during MSSA build), otherwise the renamePass will find a null DTNode.

Resolves PR41640.

Reviewers: george.burgess.iv

Subscribers: jlebar, Prazek, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61410

llvm-svn: 359845
2019-05-02 23:41:58 +00:00
Craig Topper e1e38d4248 [X86] Correct the register class for specific mask register constraints in getRegForInlineAsmConstraint when the VT is a scalar type
The default implementation in the base class for TargetLowering::getRegForInlineAsmConstraint doesn't work for mask registers when the VT is a scalar integer type, since the only legal mask types are vXi1. So we end up just getting whatever the first register class that contains the register is. Currently this appears to be VK1, but it's really dependent on the order tablegen outputs the register classes.

Some code in the caller ends up looking up the type for this register class, finds v1i1, and then generates a copyfromreg from the physical k-register with the v1i1 type. Then it generates an any_extend from v1i1 to the scalar VT, which isn't legal. This bad any_extend sticks around until isel, where it selects a MOVZX32rr8 with a v1i1 input or maybe an i8 input. Not sure, but eventually we pick up a copy from VK1 to GR8 in MachineIR, which isn't supported. This leads to a failure in physical register copying.

This patch uses the scalar type to find a VK class of the right size. In the attached test case this will be VK16. This causes a bitcast from vk16 to i16 to be generated instead of an any_extend. This will be properly iseled to a VK16 to GR32 copy and a GR32->GR16 extract_subreg.

Fixes PR41678

Differential Revision: https://reviews.llvm.org/D61453

llvm-svn: 359837
2019-05-02 22:26:40 +00:00
Jordan Rupprecht 8ab9d5a8ed Revert [ThinLTO] Fix X86/strong_non_prevailing.ll after llvm-nm 'r' change
This reverts r359314 (git commit 5015aa854d)

llvm-svn: 359831
2019-05-02 21:48:04 +00:00
Jordan Rupprecht 51a1418768 Revert [llvm-nm] Fix handling of symbol types + [llvm-nm] Generalize symbol types
This reverts r359311 and r359312 (git commit 0bf06a8f59 and 5f184f1780)

llvm-svn: 359830
2019-05-02 21:42:46 +00:00
Sanjay Patel 1972826178 [DAGCombiner] try repeated fdiv divisor transform before building estimate (2nd try)
The original patch was committed at rL359398 and reverted at rL359695 because of
infinite looping.

This includes a fix to check for a vector splat of "1.0" to avoid the infinite loop.

Original commit message:

This was originally part of D61028, but it's an independent diff.

If we try the repeated divisor reciprocal transform before producing an estimate sequence,
then we have an opportunity to use scalar fdiv. On x86, the trade-off is 1 divss vs. 5
vector FP ops in the default estimate sequence. On recent chips (Skylake, Ryzen), the
full-precision division is only 3 cycle throughput, so that's probably the better perf
default option and avoids problems from x86's inaccurate estimates.

The last 2 tests show that users still have the option to override the defaults by using
the function attributes for reciprocal estimates, but those patterns are potentially made
faster by converting the vector ops (including ymm ops) to scalar math.

Differential Revision: https://reviews.llvm.org/D61149

llvm-svn: 359793
2019-05-02 15:02:08 +00:00
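The repeated divisor transform mentioned in r359793 above can be sketched on scalars: several divisions by the same value are replaced by one division that computes a reciprocal, followed by multiplications. The helper names are hypothetical and this is not the DAGCombiner code; the results can differ from true division in the last bits, which is why the comparison below uses a tolerance.

```python
import math


def divide_each(values, d):
    """One full-precision division per value."""
    return [v / d for v in values]


def divide_via_reciprocal(values, d):
    """One division to form 1/d, then one multiply per value."""
    r = 1.0 / d
    return [v * r for v in values]


values = [1.0, 2.0, 7.0]
exact = divide_each(values, 3.0)
approx = divide_via_reciprocal(values, 3.0)
print(all(math.isclose(a, b, rel_tol=1e-12) for a, b in zip(exact, approx)))
```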
Sanjay Patel 284472be6d [SelectionDAG] remove constant folding limitations based on FP exceptions
We don't have FP exception limits in the IR constant folder for the binops (apart from strict ops),
so it does not make sense to have them here in the DAG either. Nothing else in the backend tries
to preserve exceptions (again outside of strict ops), so I don't see how this could have ever
worked for real code that cares about FP exceptions.

There are still cases (examples: unary opcodes in SDAG, FMA in IR) where we are trying (at least
partially) to preserve exceptions without even asking if the target supports FP exceptions. Those
should be corrected in subsequent patches.

Real support for FP exceptions requires several changes to handle the constrained/strict FP ops.

Differential Revision: https://reviews.llvm.org/D61331

llvm-svn: 359791
2019-05-02 14:47:59 +00:00
Simon Pilgrim df8daf0ef4 [X86][SSE] lowerAddSubToHorizontalOp - enable ymm extraction+fold
Limiting scalar hadd/hsub generation to the lowest xmm looks to be unnecessary - we will be extracting one upper xmm whatever, and we can remove a shuffle by using the hop which is in line with what shouldUseHorizontalOp expects to happen anyway.

Testing on btver2 (the main target for fast-hops) shows this is beneficial even for float ops where we have a 'shuffle' to extract the float result:
https://godbolt.org/z/0R-U-K

Differential Revision: https://reviews.llvm.org/D61426

llvm-svn: 359786
2019-05-02 14:00:55 +00:00
James Henderson e4a89a1bee [llvm-strip]Add --no-strip-all to disable --strip-all behaviour (including default stripping)
If certain switches are not specified, llvm-strip behaves as if
--strip-all were specified. This means that for testing, when we don't
want the stripping behaviour, we have to specify one of these switches,
which can be confusing. This change adds --no-strip-all to allow an
alternative way of suppressing the default stripping, in a less
confusing manner.

Reviewed by: jakehehrlich, MaskRay

Differential Revision: https://reviews.llvm.org/D61377

llvm-svn: 359781
2019-05-02 11:53:02 +00:00
Diana Picus 06a61ccc42 [ARM GlobalISel] Select extensions to < 32 bits
Select G_SEXT and G_ZEXT with destination types smaller than 32 bits in
the exact same way as 32 bits. This overwrites the higher bits, but that
should be ok since all legal users of types smaller than 32 bits ignore
those bits anyway.

llvm-svn: 359768
2019-05-02 09:28:00 +00:00
Diana Picus 7da389818d [ARM GlobalISel] Rename some inst selector tests. NFC
Prepare to add support for extensions to types smaller than 32 bits.

llvm-svn: 359767
2019-05-02 09:24:47 +00:00
Diana Picus 53bcf6f2e7 [ARM GlobalISel] Legalize extensions to < 32 bits
Make it legal to extend from e.g. s1 to s8 or s16.

llvm-svn: 359766
2019-05-02 09:21:46 +00:00
Stanislav Mekhanoshin 64399da8b8 [AMDGPU] gfx1010 lost VOP2 forms of some add/sub
Add legalization of V_ADD_I32, V_SUB_I32, V_SUBREV_I32.

Differential Revision:

llvm-svn: 359757
2019-05-02 04:26:35 +00:00
Stanislav Mekhanoshin 5cf8167735 [AMDGPU] gfx1010 allows VOP3 to have a literal
Differential Revision: https://reviews.llvm.org/D61413

llvm-svn: 359756
2019-05-02 04:01:39 +00:00
Craig Topper b929a0062e [X86] Remove the redundant suffix in vfpclassp[d,s]'s broadcasting variant
The broadcasting variant for instruction vfpclassp[d,s] shouldn't use suffix q/l. So remove them from the template.

Patch by Pengfei Wang

Differential Revision: https://reviews.llvm.org/D61295

llvm-svn: 359753
2019-05-02 03:25:50 +00:00
Bob Haarman a78ab77b6b remove inalloca parameters in globalopt and simplify argpromotion
Summary:
Inalloca parameters require special handling in some optimizations.
This change causes globalopt to strip the inalloca attribute from
function parameters when it is safe to do so, removes the special
handling for inallocas from argpromotion, and replaces it with a
simple check that causes argpromotion to skip functions that receive
inallocas (for when the pass is invoked on code that didn't run
through globalopt first). This also avoids a case where argpromotion
would incorrectly try to pass an inalloca in a register.

Fixes PR41658.

Reviewers: rnk, efriedma

Reviewed By: rnk

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61286

llvm-svn: 359743
2019-05-02 00:37:36 +00:00
Thomas Preud'homme 1feaee52ff [FileCheck] Fix line-count.txt test
Summary: Enable currently skipped diagnostic test and fix column number

Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk

Subscribers: JonChesterfield, rogfer01, hfinkel, kristina, rnk, tra, arichardson, grimar, dblaikie, probinson, llvm-commits, hiraditya

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61320

llvm-svn: 359742
2019-05-02 00:04:44 +00:00
Thomas Preud'homme 288ed91e99 FileCheck [4/12]: Introduce @LINE numeric expressions
Summary:
This patch is part of a patch series to add support for FileCheck
numeric expressions. This specific patch introduces the @LINE numeric
expressions.

This commit introduces a new syntax to express a relation a numeric
value in the input text must have with the line number of a given CHECK
pattern: [[#<@LINE numeric expression>]]. Further commits build on that
to express relations between several numeric values in the input text.
To help with naming, regular variables are renamed into pattern
variables and old @LINE expression syntax is referred to as legacy
numeric expression.

Compared to existing @LINE expressions, this new syntax allows arbitrary
spacing between the components of the expression. It otherwise offers the
same functionality, but the commit serves to introduce some of the data
structures needed to support more general numeric expressions.
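
As a rough illustration of the new syntax, the sketch below replaces each
[[#@LINE...]] expression in a CHECK pattern with a value computed from the
pattern's own line number, tolerating arbitrary spacing. It is written in
Python and is not FileCheck's implementation; the helper name
expand_line_expr and the regular expression are assumptions made for this
example only.

```python
import re

# Hypothetical pattern, not taken from FileCheck: accepts [[#@LINE]],
# [[#@LINE+1]], [[# @LINE - 2 ]] and similar spacing variants.
_LINE_EXPR = re.compile(r"\[\[#\s*@LINE\s*(?:([+-])\s*(\d+)\s*)?\]\]")


def expand_line_expr(pattern, line_number):
    """Replace each @LINE numeric expression with its evaluated value."""

    def evaluate(match):
        sign, offset = match.group(1), match.group(2)
        if sign is None:
            # Bare @LINE: the line number of the CHECK pattern itself.
            return str(line_number)
        delta = int(offset)
        return str(line_number + delta if sign == "+" else line_number - delta)

    return _LINE_EXPR.sub(evaluate, pattern)
```

For example, a pattern on line 10 containing [[# @LINE + 1 ]] would be
matched against the text 11 under this sketch.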

Copyright:
    - Linaro (changes up to diff 183612 of revision D55940)
    - GraphCore (changes in later versions of revision D55940 and
                 in new revision created off D55940)

Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk

Subscribers: hiraditya, llvm-commits, probinson, dblaikie, grimar, arichardson, tra, rnk, kristina, hfinkel, rogfer01, JonChesterfield

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60384

llvm-svn: 359741
2019-05-02 00:04:38 +00:00
Jessica Paquette d010a3b63e Fix erroneous flag in GISel line for arm64-fast-isel-materialize.ll
Accidentally put a fast-isel-abort=2 instead of the GISel abort line.

This test doesn't actually fall back at all for GISel though, so remove the
fallback checks entirely.

llvm-svn: 359737
2019-05-01 22:50:11 +00:00
Hiroshi Yamauchi 1620104034 [PGO][CHR] A bug fix.
Summary: Fix a transformation bug where two scopes share a common instruction to hoist.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61405

llvm-svn: 359736
2019-05-01 22:49:52 +00:00
Jessica Paquette a3843fe6f4 [GlobalISel][AArch64] Use fmov for G_FCONSTANT when possible
This adds support for using fmov rather than a standard mov to materialize
G_FCONSTANT when it's safe to do so.
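
One condition that plausibly matters here is whether the constant fits the
AArch64 FMOV immediate encoding, which covers values of the form
+/-(n/16) * 2^r with 16 <= n <= 31 and -3 <= r <= 4. The Python sketch below
checks only that property. It is an assumption that this is the relevant
test; the commit message does not spell out its safety conditions, and the
helper name is_fmov_immediate is hypothetical.

```python
import math
from fractions import Fraction


def is_fmov_immediate(value):
    """Return True if value has the form +/-(n/16) * 2**r, 16<=n<=31, -3<=r<=4."""
    if not math.isfinite(value) or value == 0:
        # Zero, infinities and NaN are not representable in this form.
        return False
    magnitude = abs(Fraction(value))  # exact rational value of the float
    for r in range(-3, 5):
        n = magnitude * 16 / Fraction(2) ** r
        if n.denominator == 1 and 16 <= n <= 31:
            return True
    return False
```

Under this sketch 1.0, 0.5 and -2.5 qualify, while 0.0 and 0.1 do not and
would need some other materialization.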

Update arm64-fast-isel-materialize.ll and select-constant.mir to show that the
selection is correct.

llvm-svn: 359734
2019-05-01 22:39:43 +00:00
Nikita Popov c89667db2c [AArch64] Add tests for bool vector reductions; NFC
Baseline tests for PR41635.

llvm-svn: 359723
2019-05-01 20:18:36 +00:00
Sanjay Patel 9f68614494 [PowerPC] add test that could infinite loop with reordered transforms; NFC
This is a slightly reduced version of the test from D61384.
Adding this as a preliminary step, so I can update D61149 with
the proposed fix.

llvm-svn: 359709
2019-05-01 17:34:30 +00:00
Simon Pilgrim 9f04d97cd7 [X86][SSE] Fold scalar horizontal add/sub for non-0/1 element extractions
We already perform horizontal add/sub if we extract from elements 0 and 1; this patch extends it to non-0/1 element extraction indices (as long as they are from the lowest 128-bit vector).
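
The idea can be modelled with a four-lane horizontal add, where
hadd(a, b) = [a0+a1, a2+a3, b0+b1, b2+b3]. The Python sketch below shows
that the sum of lanes 2 and 3 is available as lane 1 of hadd(v, v), just as
the sum of lanes 0 and 1 is lane 0. It assumes adjacent lane pairs in a
single 128-bit vector of four elements; the function names are illustrative
and this is not the DAG combine itself.

```python
def hadd(a, b):
    """Model a 4-lane horizontal add: [a0+a1, a2+a3, b0+b1, b2+b3]."""
    return [a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]]


def extract_pair_sum(v, i):
    """Compute v[i] + v[i+1] with one horizontal add and one lane extract."""
    if i not in (0, 2):
        raise ValueError("sketch only handles pairs starting at lane 0 or 2")
    # The pair starting at lane i lands in lane i // 2 of the result.
    return hadd(v, v)[i // 2]
```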

Differential Revision: https://reviews.llvm.org/D61263

llvm-svn: 359707
2019-05-01 17:13:35 +00:00
Stanislav Mekhanoshin 3b7925f035 [AMDGPU] gfx1010 GCNRegBankReassign pass
Reassign registers to reduce register bank conflicts.

Differential Revision: https://reviews.llvm.org/D61344

llvm-svn: 359704
2019-05-01 16:49:31 +00:00
Stanislav Mekhanoshin c29d491596 [AMDGPU] gfx1010 GCNNSAReassign pass
Convert NSA into non-NSA images.

Differential Revision: https://reviews.llvm.org/D61341

llvm-svn: 359700
2019-05-01 16:40:49 +00:00
Stanislav Mekhanoshin 692560dc98 [AMDGPU] gfx1010 MIMG implementation
Differential Revision: https://reviews.llvm.org/D61339

llvm-svn: 359698
2019-05-01 16:32:58 +00:00
Teresa Johnson b3203ec078 [ThinLTO] Fix unreachable code when parsing summary entries.
Summary:
Early returns were causing some code to be skipped. This was missed
since the summary entries are typically at the end of the llvm assembly
file.

Fixes PR41663.

Reviewers: RKSimon, wristow

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61355

llvm-svn: 359697
2019-05-01 16:26:59 +00:00
Stanislav Mekhanoshin a224f68a10 [AMDGPU] gfx1010 DS implementation
Differential Revision: https://reviews.llvm.org/D61332

llvm-svn: 359696
2019-05-01 16:11:11 +00:00
Sanjay Patel 64d5751254 Revert "[DAGCombiner] try repeated fdiv divisor transform before building estimate"
This reverts commit fb9a5307a9 (rL359398)
because it can cause an infinite loop due to opposing combines.

llvm-svn: 359695
2019-05-01 16:06:21 +00:00
Hubert Tong 02d055a269 [tests] Add host-byteorder-*-endian; update XFAILs of big-endian triples
Summary:
Triple components in `XFAIL` lines are tested against the target triple.
Various tests that are expected to fail on big-endian hosts are marked
as being `XFAIL` for big-endian targets. This patch corrects these tests
by having them test against a new `host-byteorder-big-endian` feature.
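
A feature of this kind can be derived from the byte order of the machine
running the tests rather than from the target triple. A minimal Python
sketch, assuming the feature is named directly after sys.byteorder; the
helper name host_byteorder_feature is hypothetical and the actual lit
configuration may differ:

```python
import sys


def host_byteorder_feature(byteorder=sys.byteorder):
    """Return a lit-style feature name for the given host byte order."""
    if byteorder not in ("little", "big"):
        raise ValueError(f"unexpected byte order: {byteorder!r}")
    return f"host-byteorder-{byteorder}-endian"
```

A test expected to fail only when run on a big-endian host could then be
marked with XFAIL: host-byteorder-big-endian, independent of the target
triple.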

Reviewers: xingxue, sfertile, jasonliu

Reviewed By: xingxue

Subscribers: jvesely, nhaehnle, fedor.sergeev, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60551

llvm-svn: 359689
2019-05-01 15:36:18 +00:00
Fangrui Song 9caa6b5b64 [llvm-ar][llvm-nm][llvm-size] Change -long-option to --long-option in tests. NFC
llvm-svn: 359688
2019-05-01 15:31:15 +00:00
Simon Pilgrim 6711b9699a [X86][SSE] Add demanded elts support for X86ISD::PMULDQ\PMULUDQ
Add to SimplifyDemandedVectorEltsForTargetNode and SimplifyDemandedBitsForTargetNode

llvm-svn: 359686
2019-05-01 14:50:50 +00:00
Simon Pilgrim 3d6899e369 [X86][SSE] Add SSE vector shift support to SimplifyDemandedVectorEltsForTargetNode vector splitting
llvm-svn: 359680
2019-05-01 13:51:09 +00:00