Commit Graph

192902 Commits

Author SHA1 Message Date
Wei Mi 3c96d01d2e Generate Callee Saved Register (CSR) related cfi directives like .cfi_restore.
https://reviews.llvm.org/D42848 only handled CFA related cfi directives but
didn't handle CSR related cfi. The patch adds the CSR part. Basically it reuses
the framework created in D42848. For each basicblock, the patch tracks which
CSR set have been saved at its CFG predecessors's exits, and compare the CSR
set with the set at its previous basicblock's exit (The previous block is the
block laid before the current block). If the saved CSR set at its previous
basicblock's exit is larger, .cfi_restore will be inserted.

The patch also generates proper .cfi_restore in epilogue to make sure the
saved CSR set is consistent for the incoming edges of each block.

Differential Revision: https://reviews.llvm.org/D74303
2020-03-04 11:18:37 -08:00
Guozhi Wei ee9a3eba76 [CodeGenPrepare] Handle ExtractValueInst in dupRetToEnableTailCallOpts
As the test case shows if there is an ExtractValueInst in the Ret block, function dupRetToEnableTailCallOpts can't duplicate it into the block containing call. So later no tail call is generated in CodeGen.

    This patch adds the ExtractValueInst handling code in function dupRetToEnableTailCallOpts and FoldReturnIntoUncondBranch, and later tail call can be generated for this case.

Differential Revision: https://reviews.llvm.org/D74242
2020-03-04 11:10:32 -08:00
David Green 38e532278e [LSR] Add masked load and store handling
This teaches Loop Strength Reduction the details about masked load and
store address operands, so that it can have a better time optimising
them as it would for normal loads and stores.

Differential Revision: https://reviews.llvm.org/D75371
2020-03-04 18:36:10 +00:00
Mitch Phillips 58079aa91b Revert "Fix GSYM tests to run the yaml files and fix test failures on some machines."
This reverts commit 8d41f1a023.

This change broke the MSan buildbots - see comments in
https://reviews.llvm.org/D75390 for more information.
2020-03-04 10:21:54 -08:00
Nikita Popov 293d813020 [InstCombine] Don't explicitly invoke const folding in shift combine
InstCombine uses an IRBuilder that automatically performs
target-dependent constant folding, so explicitly invoking it here
is not necessary.
2020-03-04 18:33:00 +01:00
Nikita Popov 9b5de84e27 [InstCombine] Use IRBuilder to create bitcast
This makes sure that the constant expression bitcast goes through
target-dependent constant folding, and thus avoids an additional
iteration of InstCombine.
2020-03-04 18:28:38 +01:00
Nikita Popov 17be8e4a6f [ConstProp] Add test for bitcast to gep fold; NFC 2020-03-04 18:27:20 +01:00
Nikita Popov a99b97b818 [InstSimplify] Add additional icmp of gep folding test; NFC 2020-03-04 18:27:01 +01:00
Nikita Popov 0940c32385 [InstSimplify] Regenerate compare.ll checks; NFC 2020-03-04 18:26:42 +01:00
Nikita Popov 0e890cd4d4 [ConstantFolding] Always return something from ConstantFoldConstant
Spin-off from D75407. As described there, ConstantFoldConstant()
currently returns null for non-ConstantExpr/ConstantVector inputs,
but otherwise always returns non-null, independently of whether
any folding has happened or not.

This is confusing and makes consumer code more complicated.
I would expect either that ConstantFoldConstant() returns only if
it actually folded something, or that it always returns non-null.
I'm going to the latter possibility here, which appears to be more
useful considering existing usage.

Differential Revision: https://reviews.llvm.org/D75543
2020-03-04 18:24:47 +01:00
Craig Topper 06de426426 [X86] Directly form VBROADCAST_LOAD in lowerShuffleAsBroadcast on AVX targets.
If we would emit a VBROADCAST node, we can instead directly emit
a VBROADCAST_LOAD. This allows us to get rid of the special case
to use an f64 load on 32-bit targets for vXi64.

I believe there is more cleanup we can do later in this function,
but I'll do that in follow ups.
2020-03-04 09:11:57 -08:00
Simon Pilgrim f24d90c0a6 [X86] Add tests showing failure to combine consecutive loads + FSHR into a single load
Similar to some of the regressions seen in D75114
2020-03-04 17:07:03 +00:00
Simon Pilgrim 4c411d2419 [X86] Add tests showing failure to combine consecutive loads + FSHL into a single load
Similar to some of the regressions seen in D75114
2020-03-04 17:07:02 +00:00
Raphael Isemann 8673def9c1 Fix modules build after MatrixBuilder patch
The addition of MatrixBuilder.h broke the modules build:
```
While building module 'LLVM_intrinsic_gen' imported from llvm-project/llvm/lib/IR/AbstractCallSite.cpp:19:
While building module 'LLVM_IR' imported from llvm-project/llvm/include/llvm/IR/Argument.h:19:
In file included from <module-includes>:6:
llvm-project/llvm/include/llvm/IR/MatrixBuilder.h:19:10: fatal error: cyclic dependency in module 'LLVM_intrinsic_gen': LLVM_intrinsic_gen -> LLVM_IR -> LLVM_intrinsic_gen
         ^
While building module 'LLVM_intrinsic_gen' imported from llvm-project/llvm/lib/IR/AbstractCallSite.cpp:19:
In file included from <module-includes>:1:
llvm-project/llvm/include/llvm/IR/Argument.h:19:10: fatal error: could not build module 'LLVM_IR'
 ~~~~~~~~^~~~~~~~~~~~~~~~~
llvm-project/llvm/lib/IR/AbstractCallSite.cpp:19:10: fatal error: could not build module 'LLVM_intrinsic_gen'
 ~~~~~~~~^~~~~~~~~~~~~~~~~~~~
```
2020-03-04 09:03:34 -08:00
Sanjay Patel 71a316883d [PassManager] adjust VectorCombine placement
The initial placement of vector-combine in the opt pipeline revealed phase ordering bugs:
https://bugs.llvm.org/show_bug.cgi?id=45015
https://bugs.llvm.org/show_bug.cgi?id=42022

This patch contains a few independent changes:

1. Move the pass up in the pipeline, so it happens just after loop-vectorization.
   This is only to keep vectorization passes together in the pipeline at the moment.
   I don't have evidence of interaction between these yet.
2. Add an -early-cse pass after -vector-combine to clean up redundant ops. This was
   partly proposed as far back as rL219644 (which is why it's effectively being moved
   in the old PM code). This is important because the subsequent -instcombine doesn't
   work as well without EarlyCSE. With the CSE, -instcombine is able to squash
   shuffles together in 1 of the tests (because those are simple "select" shuffles).
3. Remove the -vector-combine pass that was running after SLP. We may want to do that
   eventually, but I don't have a test case to support it yet.

Differential Revision: https://reviews.llvm.org/D75145
2020-03-04 11:10:49 -05:00
Sanjay Patel 29a2b20ab3 [SDAG] simplify FP binops to undef
As discussed in the commit thread for rGa253a2a and D73978, we can do more undef folding for FP ops.
The nnan and ninf fast-math-flags specify that if an operand is the disallowed value, the result is
poison, so we can produce an undef result.

But this doesn't work as expected (the undef operand cases remain) because of a Flags propagation
problem in SelectionDAGBuilder.

I've added DAGCombiner calls to enable these for the other cases because we've shown in other
patches that (because of the limited way that SDAG iterates), it is possible to miss simplifications
like this if they are done only at node creation time.

Several potential follow-ups to expand on this patch are possible.

Differential Revision: https://reviews.llvm.org/D75576
2020-03-04 10:42:16 -05:00
David Green 587feec07e [ARM] Change all tests from "thumbv8.1-m.main" to "thumbv8.1m.main". NFC 2020-03-04 13:47:35 +00:00
Evgeniy Brevnov e60c28746b Lost regression test from commit 5a63813dc7. 2020-03-04 19:52:42 +07:00
Pavel Labath bddab92858 Use new DWARFDataExtractor::getInitialLength in DWARFDebugFrame 2020-03-04 13:01:35 +01:00
Pavel Labath 2458492a9a Use new DWARFDataExtractor::getInitialLength in DWARFDebugPubTable 2020-03-04 13:01:35 +01:00
Pavel Labath c9579271b3 Use new DWARFDataExtractor::getInitialLength in DWARFUnit 2020-03-04 13:01:35 +01:00
Pavel Labath a8bc9c3f0f Use new DWARFDataExtractor::getInitialLength in DWARFVerifier 2020-03-04 13:01:34 +01:00
Pavel Labath eb2b17eea7 Use DWARFDataExtractor::getInitialLength in debug_aranges
Summary:
getInitialLength is a *DWARF*DataExtractor method so I had to "upgrade"
some DataExtractors to be able to make use of it.

Reviewers: ikudrin, jhenderson, probinson

Subscribers: aprantl, hiraditya, llvm-commits, dblaikie

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75535
2020-03-04 13:01:07 +01:00
Pavel Labath 38385630ad Use DWARFDataExtractor::getInitialLength in DWARFDebugAddr
Reviewers: ikudrin, jhenderson, probinson

Subscribers: hiraditya, dblaikie, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75532
2020-03-04 13:00:15 +01:00
Kerry McLaughlin f5502c7035 [AArch64][SVE] Add SVE2 intrinsic for xar
Summary: Implements the @llvm.aarch64.sve.xar intrinsic

Reviewers: andwar, c-rhodes, dancgr, efriedma, rengolin

Reviewed By: andwar

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75160
2020-03-04 11:44:32 +00:00
Evgeniy Brevnov 5a63813dc7 [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses(PR42151).
Summary:
This is second attempt to fix the problem with incorrect dependencies reported in presence of invariant load. Initial fix (https://reviews.llvm.org/D64405) was reverted due to a regression reported in https://reviews.llvm.org/D70516.

The original fix changed caching behavior for invariant loads. Namely such loads are not put into the second level cache (NonLocalDepInfo). The problem with that fix is the first level cache  (CachedNonLocalPointerInfo) still works as if invariant loads were in the second level cache. The solution is in addition to not putting dependence results into the second level cache avoid putting info about invariant loads into the first level cache as well.

Reviewers: jdoerfert, reames, hfinkel, efriedma

Reviewed By: jdoerfert

Subscribers: DaniilSuchkov, hiraditya, bmahjour, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73027
2020-03-04 18:40:02 +07:00
Simon Pilgrim e2f0093800 [AMDGPU] performCvtF32UByteNCombine - revisit node after src operand simplification.
If SimplifyDemandedBits succeeds in simplifying the byte src, add the CVT_F32_UBYTE node back to the worklist as we might be able to simplify further.

Yet another step towards removing SelectionDAG::GetDemandedBits.
2020-03-04 11:25:50 +00:00
Florian Hahn 2a70db245d [Matrix] Add IR MatrixBuilder.
This builder provides a convenient way for targets to lower various matrix
operations to LLVM IR, making use of matrix intrinsics where available.

Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D72280
2020-03-04 11:14:20 +00:00
gbreynoo 5e0f9d5d3c [llvm-ar][test] Add to llvm-ar test coverage
- Added handling of thin archives to symtab.test.
- Added handling of newlines to response.test.
- 62fa3332c9 exposed behaviour
  regarding the use of -- on the command line. Added
  double-hyphen.test to cover this.

Differential Revision: https://reviews.llvm.org/D73333
2020-03-04 10:56:48 +00:00
Georgii Rymar 1c991f907a [Object/ELF] - Fix the offset type used in ELFFile<ELFT>::getEntry().
We use size_t for a file offset what is wrong, because size_t is 32-bit
value on 32-bit platforms.

I was reported that after my 0b511c2302
"[llvm-readobj] - Report warnings instead of errors for broken relocations."

The following error is observed on 32-bit Arch Linux:

[100%] Running all regression tests
FAIL: LLVM :: tools/llvm-readobj/ELF/relocation-errors.test (52954 of 54768)
******************** TEST 'LLVM :: tools/llvm-readobj/ELF/relocation-errors.test' FAILED ***

...
llvm-project/llvm/test/tools/llvm-readobj/ELF/relocation-errors.test:9:14:error: LLVM-NEXT: expected string not found in input
# LLVM-NEXT: warning: '[[FILE]]': unable to print relocation 1 in section 3: unable to access section [index 6] data at 0x17e7e7e8b0: offset goes past the end of file
             ^
<stdin>:9:1: note: scanning from here
/llvm-project/build/bin/llvm-readobj: warning: 'llvm-project/build/test/tools/llvm-readobj/ELF/Output/relocation-errors.test.tmp64': unable to print relocation 1 in section 3: unable to access section [index 6] data at 0xe7e7e8b0: offset goes past the end of file

This patch should fix the issue.
2020-03-04 12:33:10 +03:00
Simon Tatham 068b2f313c [ARM,MVE] Add the `vshlcq` intrinsics.
Summary:
The VSHLC instruction performs a left shift of a whole vector register
by an immediate shift count up to 32, shifting in new bits at the low
end from a GPR and delivering the shifted-out bits from the high end
back into the same GPR.

Since the instruction produces two outputs (the shifted vector
register and the output GPR of shifted-out bits), it has to be
instruction-selected in C++ rather than Tablegen.

Reviewers: MarkMurrayARM, dmgreen, miyuki, ostannard

Reviewed By: miyuki

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D75445
2020-03-04 08:49:27 +00:00
Simon Tatham 810127f6ab [ARM,MVE] Add the `vsbciq` intrinsics.
Summary:
These are exactly parallel to the existing `vadciq` intrinsics, which
we implemented last year as part of the original MVE intrinsics
framework setup.

Just like VADC/VADCI, the MVE VSBC/VSBCI instructions deliver two
outputs, both of which the intrinsic exposes: a modified vector
register and a carry flag. So they have to be instruction-selected in
C++ rather than Tablegen. However, in this case, that's trivial: the
same C++ isel routine we already have for VADC works unchanged, and
all we have to do is to pass it a different instruction id.

Reviewers: MarkMurrayARM, dmgreen, miyuki, ostannard

Reviewed By: miyuki

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D75444
2020-03-04 08:49:27 +00:00
Craig Topper 9284abd004 [X86] Directly form VBROADCAST_LOAD for BUILD_VECTOR of splat loads in lowerBuildVectorAsBroadcast. 2020-03-03 22:27:34 -08:00
Juneyoung Lee 952ad4701c [ValueTracking] Let isGuaranteedNotToBeUndefOrPoison look into branch conditions of dominating blocks' terminators
Summary:
```
  br i1 c, BB1, BB2:
BB1:
  use1(c)
BB2:
  use2(c)
```

In BB1 and BB2, c is never undef or poison because otherwise the branch would have triggered UB.

Checked with Alive2

Reviewers: xbolva00, spatel, lebedev.ri, reames, jdoerfert, nlopes, sanjoy

Reviewed By: reames

Subscribers: jdoerfert, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75401
2020-03-04 11:43:31 +09:00
Amara Emerson e91e1df6ab [GlobalISel][Localizer] Enable intra-block localization of already-local uses.
This changes the localizer to attempt intra-block localizer of instructions
that have local uses. This is useful because sometimes the entry block itself
has many uses of constant-like instructions, which would benefit from shortening
live ranges. Previously if an inst had no non-local uses, we wouldn't add it to
the list of instructions to attempt further intra-block localization.

This gives a 0.7% geomean code size improvement on CTMark.

Differential Revision: https://reviews.llvm.org/D75555
2020-03-03 18:14:57 -08:00
Fangrui Song 7af4374ff8 [MC][test] Improve some llvm-objdump -t tests
Delete two redundant tests.
2020-03-03 17:27:06 -08:00
Fangrui Song 1a5da3f0b2 [gn build] Fix llvm-gsymutil after D75291 2020-03-03 16:37:52 -08:00
Fangrui Song 90acc505ed [MCDwarf] Change emitListsTableHeaderStart to use a reference and fold Start/End symbols generation into it
Apply @dblaikie's suggestions in a post-commit review for D75375

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D75568
2020-03-03 16:20:40 -08:00
Lang Hames 31e0331763 [ORC] Skip ST_File symbols in MaterializationUnit interfaces / resolution.
ST_File symbols aren't relevant for linking purposes, but can end up shadowing
real symbols if they're not filtered.

No test case yet: The ideal testcase for this would be an ELF llvm-jitlink test,
but llvm-jitlink support for ELF is still under development. We should add a
testcase for this once support lands in tree.
2020-03-03 16:15:44 -08:00
Stefanos Baziotis 6f5d5d6602 [LoopTerminology][NFC] Fix typo 2020-03-04 02:12:33 +02:00
Greg Clayton de2c586a12 Fix buildbots by including MC for StringTableBuilder. 2020-03-03 15:52:51 -08:00
Lang Hames 14ac84e5c5 [JITLink] Add a -slab-address option to llvm-jitlink.
This option can be used to for JITLink to link as-if the target memory slab were
allocated at a specific start address. This can be used to both verify that
cross-address space linking is working correctly, and to ensure that certain
address-sensitive optimizations (e.g. GOT and stub elimination) either do or do
not fire, depending on the requirements of the test case.

This argument is only valid for testing in conjunction with -noexec -slab-alloc,
and will produce an error if used without those arguments.
2020-03-03 14:25:51 -08:00
Matt Arsenault f9047ede58 LICM: Reorder condition checks
Check the fast math flag before the more expensive loop check.
2020-03-03 17:15:57 -05:00
Matt Arsenault 88aced1e45 AMDGPU: Fix computation for getOccupancyWithLocalMemSize
The computation here didn't really make sense to me, and reported
wildy different results depending on the flat work group size
attribute.

I think this should really report a range derived from the possible
work group size bounds, and only allow an occupancy that is a multiple
of the group size.
2020-03-03 17:15:57 -05:00
Brian Gesiak aa85b437a9 [Coroutines] Use dbg.declare for frame variables
Summary:
https://gist.github.com/modocache/ed7c62f6e570766c0f39b35dad675c2f
is an example of a small C++ program that uses C++20 coroutines that
is difficult to debug, due to the loss of debug info for variables that
"spill" across coroutine suspension boundaries. This patch addresses
that issue by inserting 'llvm.dbg.declare' intrinsics that point the
debugger to the variables' location at an offset to the coroutine frame.

With this patch, I confirmed that running the 'frame variable' commands in
https://gist.github.com/modocache/ed7c62f6e570766c0f39b35dad675c2f at
the specified breakpoints results in the correct values being printed
for coroutine frame variables 'i' and 'j' when using an lldb built from
trunk, as well as with gdb 8.3 (lldb 9.0.1, however, could not print the
values). The added test case also verifies this improved behavior.

The existing coro-debug.ll test case is also modified to reflect the
locations at which Clang actually places calls to 'dbg.declare', and
additional checks are added to ensure this patch works as intended in that
example as well.

Reviewers: vsk, jmorse, GorNishanov, lewissbaker, wenlei

Subscribers: EricWF, aprantl, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75338
2020-03-03 17:13:46 -05:00
Greg Clayton 90e40a0bda Rename "llvm-gsym" to "llvm-gsymutil" and fix dependencies.
Summary: This patch renames the "llvm-gsym" tool directory to "llvm-gsymutil". Dependencies are also reduced to the bare minimum for llvm-gsymutil.

Reviewers: aprantl, thakis

Subscribers: mgorny, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75291
2020-03-03 14:13:29 -08:00
Amy Huang 5b3b21f025 [DebugInfo] Fix for adding "returns cxx udt" option to functions in CodeView.
Summary:
This change checks for the return type in the frontend and adds a flag
to the DISubroutineType to indicate that the option should be added in
CodeViewDebug.

Previously function types sometimes appeared twice in the PDB: once with
"returns cxx udt" and once without.
See https://bugs.llvm.org/show_bug.cgi?id=44785.

Reviewers: rnk, asmith

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D75215
2020-03-03 14:00:08 -08:00
Lang Hames ab16ef17e8 [JITLink] Fix a pointer-to-integer cast in jitlink::InProcessMemoryManager.
reinterpret_cast'ing the block base address directly to a uint64_t leaves the
high bits in an implementation-defined state, but JITLink expects them to be
zero. Switching to pointerToJITTargetAddress for the cast should fix this.

This should fix the jitlink test failures that we have seen on some of the
32-bit testers.
2020-03-03 13:53:00 -08:00
Sanjay Patel f95095e9f6 [AArch64] add tests for nnan/ninf/undef FP simplifications; NFC 2020-03-03 16:38:58 -05:00
Vedant Kumar b5b21812dc test: Adjust no-dbg-value-after-terminator.mir to use `not --crash` 2020-03-03 13:30:31 -08:00