Commit Graph

133217 Commits

Author SHA1 Message Date
Johannes Doerfert a19eb1de72 [OpenMP] Add match_{all,any,none} declare variant selector extensions.
By default, all traits in the OpenMP context selector have to match for
it to be acceptable. However, we sometimes want a single property out of
multiple to match (=any) or no property to match at all (=none). We offer
these choices as extensions via
  `implementation={extension(match_{all,any,none})}`
to the user. The choice will affect the entire context selector, not only
the traits following the match property.
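
The three matching modes described above can be sketched in plain C++.
This is only an illustration under stated assumptions: `MatchKind` and
`selectorIsAcceptable` are hypothetical names and do not reflect Clang's
actual data structures or API.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical names for illustration; these are not Clang's data structures.
enum class MatchKind { All, Any, None };

// TraitMatched[i] records whether the i-th trait of the context selector
// matched the current OpenMP context.
inline bool selectorIsAcceptable(const std::vector<bool> &TraitMatched,
                                 MatchKind Kind) {
  auto Matched = [](bool B) { return B; };
  switch (Kind) {
  case MatchKind::All: // default behaviour: every trait has to match
    return std::all_of(TraitMatched.begin(), TraitMatched.end(), Matched);
  case MatchKind::Any: // match_any: a single matching trait is enough
    return std::any_of(TraitMatched.begin(), TraitMatched.end(), Matched);
  case MatchKind::None: // match_none: no trait may match
    return std::none_of(TraitMatched.begin(), TraitMatched.end(), Matched);
  }
  assert(false && "unknown MatchKind");
  return false;
}
```

With `arch(nvptx, nvptx64)` only one of the two properties can match on a
given target, so the default mode would reject the selector while
`match_any` accepts it.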

The first user will be D75788. There we can replace
```
  #pragma omp begin declare variant match(device={arch(nvptx64)})
  #define __CUDA__

  #include <__clang_cuda_cmath.h>

  // TODO: Hack until we support an extension to the match clause that allows "or".
  #undef __CLANG_CUDA_CMATH_H__

  #undef __CUDA__
  #pragma omp end declare variant

  #pragma omp begin declare variant match(device={arch(nvptx)})
  #define __CUDA__

  #include <__clang_cuda_cmath.h>

  #undef __CUDA__
  #pragma omp end declare variant
```
with the much simpler
```
  #pragma omp begin declare variant match(device={arch(nvptx, nvptx64)}, implementation={extension(match_any)})
  #define __CUDA__

  #include <__clang_cuda_cmath.h>

  #undef __CUDA__
  #pragma omp end declare variant
```

Reviewed By: mikerice

Differential Revision: https://reviews.llvm.org/D77414
2020-04-07 23:33:24 -05:00
Kazu Hirata 91eb442fde [JumpThreading] NFC: Simplify ComputeValueKnownInPredecessorsImpl
Summary:
ComputeValueKnownInPredecessorsImpl is the main folding mechanism in
JumpThreading.cpp.  To avoid potential infinite recursion while
chasing use-def chains, it uses:

  DenseSet<std::pair<Value *, BasicBlock *>> &RecursionSet

to keep track of Value-BB pairs that we've processed.

Now, when ComputeValueKnownInPredecessorsImpl recursively calls
itself, it always passes BB as is, so the second element is always BB.

This patch simplifies the function by dropping "BasicBlock *" from
RecursionSet.
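
As a rough illustration of the idea (not the JumpThreading code itself), a
recursion guard keyed on the value alone could look like the sketch below.
`Node`, `reachesConstantImpl` and `reachesConstant` are hypothetical names
standing in for the real Value/use-def machinery.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a value with operands (a use-def graph node).
struct Node {
  std::vector<Node *> Operands;
  bool IsConstant = false;
};

// Walks use-def chains looking for a constant. RecursionSet holds only the
// values already visited; no block is stored because it never changes
// across the recursive calls.
inline bool reachesConstantImpl(const Node *V,
                                std::unordered_set<const Node *> &RecursionSet) {
  if (!RecursionSet.insert(V).second)
    return false; // already being processed: stop to avoid infinite recursion
  if (V->IsConstant)
    return true;
  for (const Node *Op : V->Operands)
    if (reachesConstantImpl(Op, RecursionSet))
      return true;
  return false;
}

inline bool reachesConstant(const Node *V) {
  std::unordered_set<const Node *> RecursionSet;
  return reachesConstantImpl(V, RecursionSet);
}
```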

Reviewers: wmi, efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77699
2020-04-07 18:37:36 -07:00
Eli Friedman 565b56a72c [NFC] Clean up uses of LoadInst constructor. 2020-04-07 16:28:53 -07:00
Daniel Sanders 1adeeabb79 Add MIR-level debugify with only locations support for now
Summary:
Re-used the IR-level debugify for the most part. The MIR-level code then
adds locations to the MachineInstrs afterwards based on the LLVM-IR debug
info.

It's worth mentioning that the resulting locations make little sense as
the range of line numbers used in a Function at the MIR level exceeds that
of the equivalent IR level function. As such, MachineInstrs can appear to
originate from outside the subprogram scope (and from other subprogram
scopes). However, it doesn't seem worth worrying about as the source is
imaginary anyway.

There are a few high-level goals this pass works towards:
* We should be able to debugify our .ll/.mir in the lit tests without
  changing the checks and still pass them. I.e. Debug info should not change
  codegen. Combining this with a strip-debug pass should enable this. The
  main issue I ran into without the strip-debug pass was instructions with MMO's and
  checks on both the instruction and the MMO as the debug-location is
  between them. I currently have a simple hack in the MIRPrinter to
  resolve that but the more general solution is a proper strip-debug pass.
* We should be able to test that GlobalISel does not lose debug info. I
  recently found that the legalizer can be unexpectedly lossy in seemingly
  simple cases (e.g. expanding one instr into many). I have a verifier
  (will be posted separately) that can be integrated with passes that use
  the observer interface and will catch location loss (it does not verify
  correctness, just that there's zero lossage). It is a little conservative
  as the line-0 locations that arise from conflicts do not track the
  conflicting locations but it can still catch a fair bit.

Depends on D77439, D77438

Reviewers: aprantl, bogner, vsk

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77446
2020-04-07 16:25:13 -07:00
Fangrui Song 624654fd64 [VE] Migrate to the getMachineMemOperand overload using llvm::Align
Just delete the deprecated overload because nothing uses it.
2020-04-07 16:04:54 -07:00
Matt Arsenault 6011627f51 CodeGen: More conversions to use Register 2020-04-07 18:54:36 -04:00
Fangrui Song d2ef8c1f2c [ThinLTO] Drop dso_local if a GlobalVariable satisfies isDeclarationForLinker()
dso_local leads to direct access even if the definition is not within this compilation unit (it is
still in the same linkage unit). On ELF, such a relocation (e.g. R_X86_64_PC32) referencing a
STB_GLOBAL STV_DEFAULT object can cause a linker error in a -shared link.

If the linkage is changed to available_externally, the dso_local flag should be dropped, so that no
direct access will be generated.

The current behavior is benign, because -fpic does not assume dso_local
(clang/lib/CodeGen/CodeGenModule.cpp:shouldAssumeDSOLocal).
If we do that for -fno-semantic-interposition (D73865), there will be an
R_X86_64_PC32 linker error without this patch.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D74751
2020-04-07 15:46:01 -07:00
Fangrui Song 2f8fb4d1cd [VE] Adapt aa26dd9858 and 2481f26ac3 2020-04-07 15:45:19 -07:00
Wei Mi b49eac71ad Recommit [SampleFDO] Add flag for partial profile.
Fix the error of show-prof-info.test on some platforms without zlib.

The common profile usage is to collect a profile from a target and then use that profile to guide the optimized build for the same target. In some cases, no profile can be collected for a target. In those cases, although no full profile is available, it is possible to use a partial profile collected from other targets to optimize common libraries and utilities. A flag is needed to tell the partial profile apart from the full profile, so that the compiler can use a different strategy for each.

Differential Revision: https://reviews.llvm.org/D77426
2020-04-07 14:28:25 -07:00
Stanislav Mekhanoshin 96e51ed005 [AMDGPU] Implement copyPhysReg for 16 bit subregs
Differential Revision: https://reviews.llvm.org/D74937
2020-04-07 14:22:46 -07:00
Matt Arsenault 2481f26ac3 CodeGen: Use Register in TargetFrameLowering 2020-04-07 17:07:44 -04:00
Nikita Popov fe8abbf442 [BPI] Clear handles when releasing memory (NFC)
This reduces max-rss of sqlite compilation by 2.5%.
2020-04-07 22:51:01 +02:00
Matt Arsenault aa26dd9858 CodeGen: Use Register in more places 2020-04-07 15:59:40 -04:00
Wei Mi c5da949ae8 Revert "[SampleFDO] Add flag for partial profile." show-prof-info.test breaks on some platforms.
This reverts commit e3ba652a14.
2020-04-07 12:54:51 -07:00
Wei Mi e3ba652a14 [SampleFDO] Add flag for partial profile.
The common profile usage is to collect a profile from a target and then use that profile to guide the optimized build for the same target. In some cases, no profile can be collected for a target. In those cases, although no full profile is available, it is possible to use a partial profile collected from other targets to optimize common libraries and utilities. A flag is needed to tell the partial profile apart from the full profile, so that the compiler can use a different strategy for each.

Differential Revision: https://reviews.llvm.org/D77426
2020-04-07 12:17:56 -07:00
Nemanja Ivanovic ecd8435483 [NFC][PowerPC] Fix register class for patterns using XXPERMDIs
There are a few patterns where we use a superclass for inputs to this
instruction rather than the correct class. This can sometimes lead to
unnecessary copies.
2020-04-07 14:06:08 -05:00
Graham Sellers a19a56f6a1 [AMDGPU] Extend constant folding for logical operations
This patch extends existing constant folding in logical operations to
handle S_XNOR, S_NAND, S_NOR, S_ANDN2, S_ORN2, V_LSHL_ADD_U32 and
V_AND_OR_B32. Also added a couple of tests for existing folds.
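
For reference, the folds listed above can be sketched on 32-bit constants
as follows. This is a minimal sketch, not the code from the patch: the
function names are hypothetical, and the operand semantics (including
masking the shift amount to five bits) are assumptions based on the usual
meaning of these opcodes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical opcode tags for the two-operand scalar logical operations.
enum class LogicalOp { XNOR, NAND, NOR, ANDN2, ORN2 };

// Folds a two-operand logical operation on 32-bit constants.
inline uint32_t foldLogical(LogicalOp Op, uint32_t A, uint32_t B) {
  switch (Op) {
  case LogicalOp::XNOR:  return ~(A ^ B);
  case LogicalOp::NAND:  return ~(A & B);
  case LogicalOp::NOR:   return ~(A | B);
  case LogicalOp::ANDN2: return A & ~B;
  case LogicalOp::ORN2:  return A | ~B;
  }
  assert(false && "unknown LogicalOp");
  return 0;
}

// V_LSHL_ADD_U32: (A << B) + C, assuming only the low 5 bits of B are used.
inline uint32_t foldLshlAdd(uint32_t A, uint32_t B, uint32_t C) {
  return (A << (B & 31u)) + C;
}

// V_AND_OR_B32: (A & B) | C.
inline uint32_t foldAndOr(uint32_t A, uint32_t B, uint32_t C) {
  return (A & B) | C;
}
```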
2020-04-07 14:37:16 -04:00
Craig Topper c41685b16f [SelectionDAG] Make getZeroExtendInReg take a vector VT if the operand VT is a vector.
This removes a call to getScalarType from a bunch of call sites.
It also makes the behavior consistent with SIGN_EXTEND_INREG.

Differential Revision: https://reviews.llvm.org/D77631
2020-04-07 11:34:08 -07:00
Alexey Lapshin 88c2137b6d [DWARFLinker][dsymutil][NFC] Move DwarfStreamer into DWARFLinker.
For implementing "remove obsolete debug info in lld", it is necessary
to have a DWARF generation code implementation. dsymutil uses DwarfStreamer
for that purpose. DwarfStreamer uses AsmPrinter. It is considered OK
to use AsmPrinter-based code in lld (D74169). This patch moves
DwarfStreamer implementation into DWARFLinker, so that it could be reused
from lld.

Generally, a better place for such a common DWARF generation code would be
not DWARFLinker but an additional separate library. Such a library could
contain a single version of DWARF generation routines and could also
be independent of AsmPrinter. At the current moment, DwarfStreamer
does not pretend to be such a general implementation of DWARF generation.
So I decided to put it into DWARFLinker since it is the only user
of DwarfStreamer.

Testing: it passes "check-all" lit testing. MD5 checksum for clang .dSYM
bundle matches for the dsymutil with/without that patch.

Reviewed By: JDevlieghere

Differential revision: https://reviews.llvm.org/D77169
2020-04-07 21:21:54 +03:00
Eli Friedman e9ac757f79 [AArch64] Don't expand memcmp in strict align mode.
7aecf232 fixed the bug where we would miscompile, but we still generate
a crazy amount of code. Turn off the expansion until someone implements
an appropriate heuristic.

Differential Revision: https://reviews.llvm.org/D77599
2020-04-07 10:53:36 -07:00
Matt Arsenault f596ab4066 AMDGPU: Use early return 2020-04-07 13:48:00 -04:00
Sam Clegg 5be42f36f5 [WebAssembly][MC] Fix leak of std::string members in MCSymbolWasm
Summary: Fixes: https://bugs.llvm.org/show_bug.cgi?id=45452

Subscribers: dschuff, jgravelle-google, hiraditya, aheejin, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77627
2020-04-07 10:38:43 -07:00
Stanislav Mekhanoshin 12a324393d [AMDGPU] Limit endcf-collapse to simple if
We can only collapse adjacent SI_END_CF if the outer statement
belongs to a simple SI_IF; otherwise the correct mask is not in the
register we expect, but is an argument of an S_XOR instruction.

Even if SI_IF is simple it might be lowered using S_XOR because
lowering is dependent on a basic block layout. It is not
considered simple if instruction consuming its output is
not an SI_END_CF. Since that SI_END_CF might have already been
lowered to an S_OR isSimpleIf() check may return false.

This situation is an opportunity for a further optimization
of SI_IF lowering, but that is a separate optimization. In the
meantime, move SI_END_CF after the lowering, when we already know
how the rest of the CFG was lowered, since a non-simple SI_IF
case still needs to be handled.

Differential Revision: https://reviews.llvm.org/D77610
2020-04-07 10:27:23 -07:00
Matt Arsenault b281138a1b DAG: Use the correct getPointerTy in a few places
These should not be assuming address space 0. Calling getPointerTy is
generally the wrong thing to do, since you should already know the
type from the incoming IR.
2020-04-07 12:45:41 -04:00
Nikita Popov 259649a519 [RDA] Avoid full reprocessing of blocks in loops (NFCI)
RDA sometimes needs to visit blocks twice, to take into account
reaching defs coming in along loop back edges. Currently it handles
repeated visitation the same way as usual, which means that it will
scan through all instructions and their reg unit defs again. Not
only is this very inefficient, it also means that all reaching defs
in loops are going to be inserted twice.

We can do much better than this. The only thing we need to handle
is a new reaching def from a predecessor, which either needs to be
prepended to the reaching definitions (if there was no reaching def
from a predecessor), or needs to replace an existing predecessor
reaching def, if it is more recent. Since D77508 we only store the
most recent predecessor reaching def, so that's the only one that
may need updating.

This also has the nice side-effect that reaching definitions are
now automatically sorted and unique, so drop the llvm::sort() call
in favor of an assertion.
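
The update rule described above can be sketched as follows. This is only an
illustration with assumed conventions: defs are plain integers, a negative
value marks a def that reaches the block from a predecessor, and a larger
negative value is taken to be the more recent one. `mergeIncomingDef` is a
hypothetical name, not the function in the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Defs holds the reaching definitions of one reg unit in one block, in
// increasing order. Assumption of this sketch: negative entries come from a
// predecessor, non-negative entries are instruction positions in the block.
inline void mergeIncomingDef(std::vector<int> &Defs, int Incoming) {
  assert(Incoming < 0 && "incoming defs come from a predecessor");
  if (Defs.empty() || Defs.front() >= 0)
    Defs.insert(Defs.begin(), Incoming); // no predecessor def yet: prepend
  else if (Incoming > Defs.front())
    Defs.front() = Incoming; // more recent predecessor def: replace
  // The list stays sorted and unique, so an assertion replaces any sorting.
  assert(std::adjacent_find(Defs.begin(), Defs.end(),
                            std::greater_equal<int>()) == Defs.end());
}
```

Revisiting a block then costs one comparison per reg unit instead of a
second scan over all of its instructions.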

Differential Revision: https://reviews.llvm.org/D77511
2020-04-07 17:55:37 +02:00
Nikita Popov 76e987b372 [RDA] Don't pass down TraversedMBB (NFC)
Only pass the MachineBasicBlock itself down to helper methods,
they don't need to know about traversal. Move the debug print
into the main method.
2020-04-07 17:53:04 +02:00
Nikita Popov 361c29d7ba [RDA] Avoid inserting duplicate reaching defs (NFCI)
An instruction may define the same reg unit multiple times,
avoid inserting the same reaching def multiple times in that case.

Also print the reg unit, rather than the super-register, in the
debug code.
2020-04-07 17:50:38 +02:00
David Tenty b9245f14b7 [NFC][PowerPC] Cleanup 64-bit and Darwin CalleeSavedRegs
Summary:
- Remove the no longer used Darwin CalleeSavedRegs
- Combine the SVR464 callee saved regs and AIX64 since the two are (and should be) identical into PPC64
- Update tests for 64-bit CSR change

Reviewers: sfertile, ZarkoCA, cebowleratibm, jasonliu, #powerpc

Reviewed By: sfertile

Subscribers: wuzish, nemanjai, hiraditya, kbarton, shchenz, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77235
2020-04-07 11:49:10 -04:00
Simon Pilgrim e3b6059776 [X86][SSE] combineX86ShufflesConstants - early out for zeroable vectors (PR45443)
Shuffle combining can insert zero byte sized elements into the shuffle mask, which combineX86ShufflesConstants will attempt to fold without taking into account whether the byte-sized type is legal (e.g. AVX512F only targets).

If we have a fully zeroable vector then we should just return a zero version of the root type; otherwise, if the type isn't valid, we should bail.

Fixes PR45443
2020-04-07 14:45:29 +01:00
Keith Walker 01dc10774e [ARM] unwinding .pad instructions missing in execute-only prologue
If the stack pointer is altered for local variables and we are generating
Thumb2 execute-only code the .pad directive is missing.

Usually the size of the adjustment is stored in a PC-relative location
and loaded into a register which is then added to the stack pointer.
However, when we are generating execute-only code, the size of the
adjustment is instead generated using the MOVW/MOVT instruction pair.

As a by-product of handling the execute-only case this also fixes an
existing issue that in the non-execute-only case the .pad directive was
generated against the load of the constant to a register instruction,
instead of the instruction which adds the register to the stack pointer.

Differential Revision: https://reviews.llvm.org/D76849
2020-04-07 11:51:59 +01:00
Florian Hahn 6aabb109be [SCCP] Use ranges for predicate info conditions.
This patch updates the code that deals with conditions from predicate
info to make use of constant ranges.

For ssa_copy instructions inserted by PredicateInfo, we have 2 ranges:
1. The range of the original value.
2. The range imposed by the linked condition.

1. is known, 2. can be determined using makeAllowedICmpRegion. The
intersection of those ranges is the range for the copy.
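
A simplified sketch of the intersection step, assuming closed, non-wrapping
signed intervals rather than LLVM's ConstantRange, and handling only a
signed less-than condition. `Range`, `allowedForSignedLessThan`, `intersect`
and `rangeForCopy` are hypothetical names used for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Closed interval [Lo, Hi]; Lo > Hi encodes the empty range.
struct Range {
  int64_t Lo;
  int64_t Hi;
  bool isEmpty() const { return Lo > Hi; }
};

// Range 2: the values allowed by the linked condition `x < C` (signed).
inline Range allowedForSignedLessThan(int64_t C) {
  assert(C > std::numeric_limits<int64_t>::min() &&
         "no value is less than INT64_MIN");
  return {std::numeric_limits<int64_t>::min(), C - 1};
}

inline Range intersect(Range A, Range B) {
  return {std::max(A.Lo, B.Lo), std::min(A.Hi, B.Hi)};
}

// The range for the copy is the original range (range 1) intersected with
// the range imposed by the condition (range 2).
inline Range rangeForCopy(Range Original, int64_t C) {
  return intersect(Original, allowedForSignedLessThan(C));
}
```

For example, an original range of [0, 100] guarded by `x < 10` yields
[0, 9] for the copy, while [20, 30] under the same condition yields an
empty range.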

With this patch, we get a nice increase in the number of instructions
eliminated by both SCCP and IPSCCP for some benchmarks:

For MultiSource, SPEC2000 & SPEC2006:

Tests: 237
Same hash: 170 (filtered out)
Remaining: 67
Metric: sccp.NumInstRemoved
Program                                        base    patch   diff
 test-suite...Source/Benchmarks/sim/sim.test    10.00   71.00  610.0%
 test-suite...CFP2000/177.mesa/177.mesa.test   361.00  1626.00 350.4%
 test-suite...encode/alacconvert-encode.test   141.00  602.00  327.0%
 test-suite...decode/alacconvert-decode.test   141.00  602.00  327.0%
 test-suite...CI_Purple/SMG2000/smg2000.test   1639.00 4093.00 149.7%
 test-suite...peg2/mpeg2dec/mpeg2decode.test    75.00  163.00  117.3%
 test-suite...T2006/401.bzip2/401.bzip2.test   358.00  513.00  43.3%
 test-suite...rks/FreeBench/pifft/pifft.test    11.00   15.00  36.4%
 test-suite...langs-C/unix-tbl/unix-tbl.test     4.00    5.00  25.0%
 test-suite...lications/sqlite3/sqlite3.test   541.00  667.00  23.3%
 test-suite.../CINT2000/254.gap/254.gap.test   243.00  299.00  23.0%
 test-suite...ks/Prolangs-C/agrep/agrep.test    25.00   29.00  16.0%
 test-suite...marks/7zip/7zip-benchmark.test   1135.00 1304.00 14.9%
 test-suite...lications/ClamAV/clamscan.test   1105.00 1268.00 14.8%
 test-suite...urce/Applications/lua/lua.test   398.00  436.00   9.5%

Metric: sccp.IPNumInstRemoved
Program                                        base   patch   diff
 test-suite...C/CFP2000/179.art/179.art.test     1.00   3.00  200.0%
 test-suite...006/447.dealII/447.dealII.test   429.00 1056.00 146.2%
 test-suite...nch/fourinarow/fourinarow.test     3.00   7.00  133.3%
 test-suite...CI_Purple/SMG2000/smg2000.test   818.00 1748.00 113.7%
 test-suite...ks/McCat/04-bisect/bisect.test     3.00   5.00  66.7%
 test-suite...CFP2000/177.mesa/177.mesa.test   165.00 255.00  54.5%
 test-suite...ediabench/gsm/toast/toast.test    18.00  27.00  50.0%
 test-suite...telecomm-gsm/telecomm-gsm.test    18.00  27.00  50.0%
 test-suite...ks/Prolangs-C/agrep/agrep.test    24.00  35.00  45.8%
 test-suite...TimberWolfMC/timberwolfmc.test    43.00  62.00  44.2%
 test-suite...encode/alacconvert-encode.test    46.00  66.00  43.5%
 test-suite...decode/alacconvert-decode.test    46.00  66.00  43.5%
 test-suite...langs-C/unix-tbl/unix-tbl.test    12.00  17.00  41.7%
 test-suite...peg2/mpeg2dec/mpeg2decode.test    31.00  41.00  32.3%
 test-suite.../CINT2000/254.gap/254.gap.test   117.00 154.00  31.6%

Reviewers: efriedma, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D76611
2020-04-07 11:09:18 +01:00
Serguei Katkov b7e3759e17 [DAG] Consolidate require spill slot logic in lambda. NFC.
Move the logic that decides whether lowering of a deopt value requires a
spill slot into a separate lambda.

Reviewers: reames, dantrushin
Reviewed By: dantrushin
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D77629
2020-04-07 16:43:47 +07:00
Peter Smith 14c1e98754 [ARM] Remove condition that could never be true
From Arm v8 Architecture Reference Manual F5.1.84 LDREXD
The ldrexd instruction in Arm state has the following conditions:

t = UInt(Rt); t2 = t + 1; n = UInt(Rn);
if Rt<0> == '1' || t2 == 15 || n == 15 then UNPREDICTABLE;

In particular, the instruction is UNPREDICTABLE when Rt is odd or when Rt is 14 (making t2 15).
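
The quoted pseudo-code translates directly into a small predicate. The
sketch below only restates the condition from the manual;
`isUnpredictableLDREXD` is a hypothetical name and not the decoder function
touched by this patch.

```cpp
#include <cassert>

// Mirrors the quoted condition:
//   if Rt<0> == '1' || t2 == 15 || n == 15 then UNPREDICTABLE;
inline bool isUnpredictableLDREXD(unsigned Rt, unsigned Rn) {
  assert(Rt < 16 && Rn < 16 && "register numbers are 4-bit fields");
  unsigned T2 = Rt + 1;
  bool RtIsOdd = (Rt & 1u) != 0; // Rt<0> == '1'
  return RtIsOdd || T2 == 15 || Rn == 15;
}
```

Since Rt == 14 is even, it is only caught by the `t2 == 15` term.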

In the implementation when the pair is the UNPREDICTABLE R14_R15 we
would ideally return SOFT_FAIL. We can't because there is no R14_R15
value for us to return so we fail early returning FAIL.

The early return for registers outside the bounds of the table means
the check for Rt == 14 (0xE) is redundant, which causes a static analyzer
to flag the condition as never being true.

To fix the warning I've removed the check and replaced with a comment
explaining the difference with the specification.

Fixes pr41660

Differential Revision: https://reviews.llvm.org/D77463
2020-04-07 09:50:56 +01:00
Simon Tatham aab9e9de4d [Support,Windows] Tolerate failure of CryptGenRandom
Summary:
In `Unix/Process.inc`, we seed a random number generator from
`/dev/urandom` if possible, but if not, we're happy to fall back to
ordinary pseudorandom strategies, like the current time and PID.

The corresponding function on Windows calls `CryptGenRandom`, but it
//doesn't// have a fallback if that strategy fails. But `CryptGenRandom`
//can// fail, if a cryptography provider isn't properly initialized, or
occasionally (by our observation) simply intermittently.

If it's reasonable on Unix to implement traditional pseudorandom-number
seeding as a fallback, then it's surely reasonable to do the same on
Windows. So this patch adds a last-ditch use of ordinary rand(), using
much the same strategy as the Unix fallback code.
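
The fallback pattern can be sketched portably as below. This is an
illustration under stated assumptions, not the code in `Windows/Process.inc`:
`TrySecure` is a hypothetical callable standing in for `CryptGenRandom` (or
`/dev/urandom`), and the fallback here seeds from the current time only.

```cpp
#include <cassert>
#include <chrono>
#include <cstdlib>
#include <functional>

// TrySecure is a hypothetical stand-in for the preferred entropy source.
// It returns false on failure, in which case we fall back to rand().
inline unsigned
getRandomNumberWithFallback(const std::function<bool(unsigned &)> &TrySecure) {
  unsigned Value = 0;
  if (TrySecure(Value))
    return Value;

  // Last-ditch fallback: ordinary pseudorandom numbers, seeded once.
  static bool Seeded = false;
  if (!Seeded) {
    auto Now = std::chrono::system_clock::now().time_since_epoch().count();
    std::srand(static_cast<unsigned>(Now));
    Seeded = true;
  }
  return static_cast<unsigned>(std::rand());
}
```

The fallback values are not suitable for cryptographic use; they only keep
callers working when the preferred source fails.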

Reviewers: hans, sammccall

Reviewed By: hans

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77553
2020-04-07 09:18:12 +01:00
Pierre-vh 4fc59a468f Revert "[CodeGen][SelectionDAG] Flip Booleans More Often"
This reverts commit 23342bdcc8.
2020-04-07 09:09:10 +01:00
Pierre-vh 23342bdcc8 [CodeGen][SelectionDAG] Flip Booleans More Often
Differential Revision: https://reviews.llvm.org/D77201
2020-04-07 08:19:57 +01:00
Sam Clegg f0bbf3d086 [WebAssembly] EmscriptenEHSjLj: Mark more functions as imported
These should have been part of https://reviews.llvm.org/D77192

Differential Revision: https://reviews.llvm.org/D77358
2020-04-06 21:27:31 -07:00
Xiang1 Zhang 01a32f2bd3 Enable IBT(Indirect Branch Tracking) in JIT with CET(Control-flow Enforcement Technology)
Do not commit llvm/test/ExecutionEngine/MCJIT/cet-code-model-lager.ll because it will
cause build bot failures (it is not suitable for 32-bit Windows targets).

Summary:
This patch comes from H.J.'s 2bd54ce7fa

**This patch fixes the LLVM unit tests that fail when run on a CET machine.** (e.g. ExecutionEngine/MCJIT/MCJITTests)

The main reason we enable IBT when the JIT is compiled with CET is that the JIT does not know whether its caller program is CET-enabled.
If the JIT's caller program is non-CET, it does not matter whether the JIT generates CET code.
But if the JIT's caller program is CET-enabled, the JIT must generate CET code, or it will cause control protection exceptions.

I have tested the patch with llvm-unit-test and llvm-test-suite on a CET machine, and it passed.
H.J. also tested it by building and running VNCserver (Virtual Network Computing), and it works too.
(Without this patch, VNCserver crashes on a CET machine.)

Reviewers: hjl.tools, craig.topper, LuoYuanke, annita.zhang, pengfei

Reviewed By: LuoYuanke

Subscribers: tstellar, efriedma, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76900
2020-04-07 09:48:47 +08:00
Jun Ma 46bff786bc [Coroutines] Remove alignment check in shouldBeMustTail
Differential Revision: https://reviews.llvm.org/D77362
2020-04-07 09:07:34 +08:00
Eli Friedman 3f13ee8a00 [NFC] Modernize misc. uses of Align/MaybeAlign APIs.
Use the current getAlign() APIs where it makes sense, and use Align
instead of MaybeAlign when we know the value is non-zero.
2020-04-06 17:53:04 -07:00
Eli Friedman 68b03aee1a Remove SequentialType from the type hierarchy.
Now that we have scalable vectors, there's a distinction that isn't
getting captured in the original SequentialType: some vectors don't have
a known element count, so counting the number of elements doesn't make
sense.

In some cases, there's a better way to express the commonality using
other methods. If we're dealing with GEPs, there's GEP methods; if we're
dealing with a ConstantDataSequential, we can query its element type
directly.

In the relatively few remaining cases, I just decided to write out
the type checks. We're talking about relatively few places, and I think
the abstraction doesn't really carry its weight. (See thread "[RFC]
Refactor class hierarchy of VectorType in the IR" on llvmdev.)

Differential Revision: https://reviews.llvm.org/D75661
2020-04-06 17:03:49 -07:00
Davide Italiano 8115e08b05 [MachineCSE] Don't carry the wrong location when hoisting
PR: 45425
<rdar://problem/61359768>

Differential Revision:  https://reviews.llvm.org/D77604
2020-04-06 16:36:22 -07:00
Daniel Sanders f27cea721e Add way to omit debug-location from MIR output
Summary:
In lieu of a proper pass that strips debug info, add a way
to omit debug-locations from the MIR output so that
instructions with MMO's continue to match CHECK's when
mir-debugify is used

Reviewers: aprantl, bogner, vsk

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77575
2020-04-06 16:22:01 -07:00
Nick Desaulniers 41ba80182c [CallSite Removal] a CallBase is never an IndirectCall for isInlineAsm
Summary:
Thanks to Bill Wendling (void) for the report and steps to reproduce.  It looks
like this was missed during r350508's cleanup of the CallSite split into
CallBase, CallInst, and CallBrInst.

This was exposed by running pgo on a callbr, which was creating a ptrtoint to
the inline asm thinking it was an indirect call. The relevant callchain looks
like:

    IndirectCallPromotionPlugin::run()
    -> PGOIndirectCallVisitor::findIndirectCalls()
      -> PGOIndirectCallVisitor::visitCallBase()
        -> CallBase::isIndirectCall()

Reviewers: void, chandlerc

Reviewed By: void

Subscribers: hiraditya, llvm-commits, craig.topper, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77600
2020-04-06 16:14:46 -07:00
Vedant Kumar 5f185a8999 [AddressSanitizer] Fix for wrong argument values appearing in backtraces
Summary:
In some cases, ASan may insert instrumentation before function arguments
have been stored into their allocas. This causes two issues:

1) The argument value must be spilled until it can be stored into the
   reserved alloca, wasting a stack slot.

2) Until the store occurs in a later basic block, the debug location
   will point to the wrong frame offset, and backtraces will show an
   uninitialized value.

The proposed solution is to move instructions which initialize allocas
for arguments up into the entry block, before the position where ASan
starts inserting its instrumentation.

For the motivating test case, before the patch we see:

```
 | 0033: movq %rdi, 0x68(%rbx)  |   | DW_TAG_formal_parameter     |
 | ...                          |   |   DW_AT_name ("a")          |
 | 00d1: movq 0x68(%rbx), %rsi  |   |   DW_AT_location (RBX+0x90) |
 | 00d5: movq %rsi, 0x90(%rbx)  |   |       ^ not correct ...     |
```

and after the patch we see:

```
 | 002f: movq %rdi, 0x70(%rbx)  |   | DW_TAG_formal_parameter     |
 |                              |   |   DW_AT_name ("a")          |
 |                              |   |   DW_AT_location (RBX+0x70) |
```

rdar://61122691

Reviewers: aprantl, eugenis

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77182
2020-04-06 15:59:25 -07:00
Daniel Sanders 35b7b0851b Allow MachineFunction to obtain non-const Function (to enable MIR-level debugify)
Summary:
To debugify MIR, we need to be able to create metadata and to do that, we
need a non-const Module. However, MachineFunction only had a const reference
to the Function preventing this.

Reviewers: aprantl, bogner

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77439
2020-04-06 15:19:21 -07:00
Daniel Sanders 15f7bc7857 Add option to limit Debugify to locations (omitting variables)
Summary:
It can be helpful to test behaviour w.r.t locations without having DEBUG_VALUE
around. In particular, because DEBUG_VALUE has the potential to change CodeGen
behaviour (e.g. hasOneUse() vs hasOneNonDbgUse()) while locations generally
don't.

Reviewers: aprantl, bogner

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77438
2020-04-06 15:04:55 -07:00
David Blaikie 5aead592f0 X86ISelLowering: Minor refactor to avoid redundant initialization while ensuring compiler warnings can hopefully still prove initialization
Based on post-commit review/discussion in fabe52a7412b
2020-04-06 14:25:52 -07:00
Konstantin Pyzhov 72e8754916 [AMDGPU] Disable 'Skip Uniform Regions' optimization by default for AMDGPU.
Reviewers: sameerds, dstuttard

Differential Revision: https://reviews.llvm.org/D77228
2020-04-06 09:05:58 -04:00
Leonard Chan a0222ac1f9 [AsmPrinter] Do not define local aliases for global objects in a comdat
A global symbol that is defined in a comdat should not generate an alias since
call sites that would've referred to that symbol will refer to their own
independent local aliases rather than the surviving global comdat one. This
could result in something that looks like:

```
ld.lld: error: relocation refers to a discarded section: .text._ZN3fbl8internal18NullFunctionTargetIvJjjPjEED1Ev.stub
>>> defined in user-x64-clang/obj/system/ulib/minfs/libminfs.a(minfs._sources.file.cc.o)
>>> section group signature: _ZN3fbl8internal18NullFunctionTargetIvJjjPjEED1Ev.stub
>>> prevailing definition is in user-x64-clang/obj/system/ulib/minfs/libminfs.a(minfs._sources.vnode.cc.o)
>>> referenced by function.h:169 (../../zircon/system/ulib/fbl/include/fbl/function.h:169)
>>>               minfs._sources.file.cc.o:(minfs::File::AllocateAndCommitData(std::__2::unique_ptr<minfs::Transaction, std::__2::default_delete<minfs::Transaction> >)) in archive user-x64-clang/obj/system/ulib/minfs/libminfs.a
```

We ran into this when experimenting with a new C++ ABI for fuchsia
(refer to D72959) which takes relative offsets between comdat'd functions
which is why the normal C++ user wouldn't run into this.

Differential Revision: https://reviews.llvm.org/D77429
2020-04-06 13:48:05 -07:00
Nick Desaulniers 5bc291be71 [SelectionDAG] fix predecessor list for INLINEASM_BRs' parent
Summary:
A bug report mentioned that LLVM was producing jumps off the end of a
function when using "asm goto with outputs". Further digging pointed to
MachineBasicBlocks that had their address taken and were indirect
targets of INLINEASM_BR being removed by BranchFolder, because their
predecessor list was empty, so they appeared to have no entry.

This was a cascading failure caused earlier, during Pre-RA instruction
scheduling. We have a few special cases in Pre-RA instruction scheduling
where we split a MachineBasicBlock in two.  This requires careful
handling of predecessor and successor lists for a MachineBasicBlock that
was split, and careful handling of PHI MachineInstrs that referred to the
MachineBasicBlock before it was split.

The clue that led to this fix was the observation that many callers of
MachineBasicBlock::splice() frequently call
MachineBasicBlock::transferSuccessorsAndUpdatePHIs() to update their PHI
nodes after a splice. We don't want to reuse that method, as we have
custom successor transferring logic for this block split.

This patch fixes 2 pre-existing bugs, and adds tests.

The first bug was that MachineBasicBlock::splice() correctly handles
updating most successors and predecessors; we don't need to do anything
more than removing the previous fallthrough block from the first half of
the split block post splice. Previously, we were updating the successor
list incorrectly (updating successors updates predecessors).

The second bug was that PHI nodes that needed registers from the first
half of the split block were not having entries populated.  The register
live out information was correct, and the FuncInfo->PHINodesToUpdate was
correct. Specifically, the check in SelectionDAGISel::FinishBasicBlock:

    for (unsigned i = 0, e = FuncInfo->PHINodesToUpdate.size(); i != e; ++i) {
      MachineInstrBuilder PHI(*MF, FuncInfo->PHINodesToUpdate[i].first);
      if (!FuncInfo->MBB->isSuccessor(PHI->getParent()))
        continue;
      PHI.addReg(FuncInfo->PHINodesToUpdate[i].second).addMBB(FuncInfo->MBB);

was `continue`ing because FuncInfo->MBB tracks the second half of
the post-split block; no one was updating PHI entries for the first half
of the post-split block.

SelectionDAGBuilder::UpdateSplitBlock() already expects to perform
special handling for MachineBasicBlocks that were split post calls to
ScheduleDAGSDNodes::EmitSchedule(), so I'm confident that it's both
correct for ScheduleDAGSDNodes::EmitSchedule() to return the second half
of the split block `CopyBB` which updates `FuncInfo->MBB` (ie. the
current MachineBasicBlock being processed), and perform special handling
for this in SelectionDAGBuilder::UpdateSplitBlock().

Reviewers: void, craig.topper, efriedma

Reviewed By: void, efriedma

Subscribers: hfinkel, fhahn, MatzeB, efriedma, hiraditya, llvm-commits, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76961
2020-04-06 13:46:39 -07:00
Matt Arsenault 869f05c834 AMDGPU: Remove dead paths for requiresUniformRegister
The extracts from control flow intrinsics are already properly handled
by divergence analysis. The inline asm case isn't dead, but has also
never really worked correctly so leave it as-is for now.
2020-04-06 16:15:10 -04:00
Francesco Petrogalli 53b7abdd23 [llvm][CodeGen] Avoid implicit cast of TypeSize to integer in `initActions`.
Reviewers: sdesmalen, efriedma

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77317
2020-04-06 19:46:11 +01:00
Masoud Ataei jaliseh 9ed0612cca Add InjectTLIMappings pass to new pass manager
This pass was created in d6de5f12d4 and tested
for the new and legacy pass managers, but was never added to the new pass
manager pipeline. This patch adds it to the new pass manager pipeline.

This pass is used by the Vector Function Database (VFDatabase); without
it in the new pass manager pipeline, none of the vector libraries work
with the new pass manager.

Related passes:
66c120f025
https://reviews.llvm.org/D74944

Differential revision: https://reviews.llvm.org/D75354
2020-04-06 13:16:48 -05:00
Craig Topper 07ed1fb597 [SelectionDAGBuilder] Fix ISD::FREEZE creation for structs with fields of different types.
The previous code used the type of the first field for the VT
passed to getNode for every field.

I've based the implementation here off what is done in visitSelect
as it removes the need to special case aggregates.

Differential Revision: https://reviews.llvm.org/D77093
2020-04-06 11:03:40 -07:00
Konstantin Pyzhov 51dc028314 Revert e1730cfeb3 2020-04-06 05:56:11 -04:00
Kirill Naumov 3f995ce8b5 [CFGPrinter][CallPrinter][polly] Adding distinct structure for CFGDOTInfo
The patch introduces a distinct structure for storing the information
needed for the Control Flow Graph, as well as the infrastructure needed for
the follow-up changes: BlockFrequencyInfo and BranchProbabilityInfo.
The patch is part of a sequence of three patches related to graph heat coloring.

Reviewers: rcorcs, apilipenko, davidxl, sfertile, fedor.sergeev, eraman, bollu

Differential Revision: https://reviews.llvm.org/D76820
2020-04-06 17:42:54 +00:00
Konstantin Pyzhov e1730cfeb3 [AMDGPU] Disable 'Skip Uniform Regions' optimization by default for AMDGPU.
Reviewers: sameerds, dstuttard

Differential Revision: https://reviews.llvm.org/D77228
2020-04-06 05:10:37 -04:00
Fangrui Song a5d375e0cb [AArch64] Allow logical immediates to have all-1 in top bits
So that constant expressions like the following are permitted:

and w0, w0, #~(0xfe<<24)
and w1, w1, #~(0xff<<24)

The behavior matches GNU as (opcodes/aarch64-opc.c:aarch64_logical_immediate_p).

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D75885
2020-04-06 09:56:04 -07:00
Florian Hahn 7aba6a0333 [LV] Fix value that could be read uninitialized.
This should fix
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap-msan/builds/18569
2020-04-06 17:54:50 +01:00
Nikita Popov e8b83f7ddc [RDA] Only store most recent reaching def from predecessors (NFCI)
When entering a basic block, RDA inserts reaching definitions coming
from predecessor blocks (which will be negative numbers) in a rather
peculiar way. If you have incoming reaching definitions -4, -3, -2, -1,
it will insert those. If you have incoming reaching definitions
-1, -2, -3, -4, it will insert -1, -1, -1, -1, as the max is taken
at each step. That's probably not what was intended...

However, RDA only actually cares about the most recent reaching
definition from a predecessor (to calculate clearance), so this
ends up working fine as far as behavior is concerned. It does
waste memory on unnecessary reaching definitions though.

This patch changes the implementation to first compute the most
recent reaching definition in one loop, and then insert only that
one in a separate loop.
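
The difference between the two schemes can be modelled in a few lines of standalone C++. This is only a sketch of the behaviour described above, not the ReachingDefAnalysis code itself; `insertRunningMax` and `insertMostRecent` are hypothetical names, and a plain vector stands in for the per-register definition list.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Old scheme as described above: keep a running max over the incoming
// definitions and insert that max at every step.
std::vector<int> insertRunningMax(const std::vector<int> &Incoming) {
  assert(!Incoming.empty() && "expected at least one incoming definition");
  std::vector<int> Inserted;
  int Max = Incoming.front();
  for (int Def : Incoming) {
    Max = std::max(Max, Def);
    Inserted.push_back(Max);
  }
  return Inserted;
}

// New scheme: first find the most recent (largest) definition in one loop,
// then insert only that one.
std::vector<int> insertMostRecent(const std::vector<int> &Incoming) {
  assert(!Incoming.empty() && "expected at least one incoming definition");
  int MostRecent = *std::max_element(Incoming.begin(), Incoming.end());
  return {MostRecent};
}
```

With incoming definitions -1, -2, -3, -4 the first function yields four copies of -1, while the second yields a single -1.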

Differential Revision: https://reviews.llvm.org/D77508
2020-04-06 18:39:09 +02:00
Nikita Popov 8d75df1438 [RDA] Don't adjust ReachingDefDefaultVal (NFCI)
At the end of a basic block, RDA adjusts all the reaching defs it
found to be relative to the end of the basic block, rather than the
start of it. However, it also does this to registers which don't
have a reaching def, indicated by ReachingDefDefaultVal. This means
that code checking against ReachingDefDefaultVal will not skip them,
and may insert them into the reaching definition list. This is
ultimately harmless, but causes unnecessary work and is logically
not right.
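
A rough standalone sketch of the intended behaviour, assuming a sentinel value marks "no reaching def". `NoReachingDef` and `rebaseToBlockEnd` are hypothetical names used for illustration; the real constant is ReachingDefDefaultVal, and the value below is an assumption.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sentinel meaning "this register has no reaching def".
constexpr int NoReachingDef = -(1 << 20);

// Rebase definitions from "relative to block start" to "relative to block
// end", leaving the sentinel untouched so later checks still recognise it.
void rebaseToBlockEnd(std::vector<int> &Defs, int NumInstrs) {
  assert(NumInstrs >= 0 && "instruction count cannot be negative");
  for (int &Def : Defs)
    if (Def != NoReachingDef)
      Def -= NumInstrs;
}
```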

Differential Revision: https://reviews.llvm.org/D77506
2020-04-06 18:36:29 +02:00
Sanjay Patel fbb1b43f13 [ValueTracking] enhance matching of umin/umax with 'not' operands
The cmyk test is based on the known regression that resulted from:
rGf2fbdf76d8d0

This improves on the equivalent signed min/max change:
rG867f0c3c4d8c

The underlying icmp equivalence is:
  ~X pred ~Y --> Y pred X

For an icmp with constant, canonicalization results in a swapped pred:
  ~X < C -->  X > ~C
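
Both rewrites can be checked exhaustively at 8 bits for the unsigned "less than" predicate. The sketch below is standalone C++, not ValueTracking code; `not8` and `checkNotCompareRewrites` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// Bitwise 'not' truncated back to 8 bits (C++ promotes uint8_t to int first).
uint8_t not8(uint8_t V) { return static_cast<uint8_t>(~V); }

// Exhaustively checks both rewrites for unsigned 'less than'.
// Returns the number of (X, Y) pairs examined.
unsigned checkNotCompareRewrites() {
  unsigned Checked = 0;
  for (unsigned A = 0; A < 256; ++A) {
    for (unsigned B = 0; B < 256; ++B) {
      uint8_t X = static_cast<uint8_t>(A), Y = static_cast<uint8_t>(B);
      // ~X pred ~Y --> Y pred X
      assert((not8(X) < not8(Y)) == (Y < X));
      // ~X < C --> X > ~C, with Y playing the role of the constant C
      assert((not8(X) < Y) == (X > not8(Y)));
      ++Checked;
    }
  }
  return Checked;
}
```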
2020-04-06 11:51:59 -04:00
Matt Arsenault 8a5f0dafd4 AMDGPU/GlobalISel: Select llvm.amdgcn.div.scale 2020-04-06 11:50:19 -04:00
Matt Arsenault e87ec66762 AMDGPU/GlobalISel: Fix llvm.amdgcn.div.fmas.ll 2020-04-06 11:50:16 -04:00
Jay Foad ddd2f4b96f [AMDGPU] Fix inaccurate comments 2020-04-06 16:44:08 +01:00
Florian Hahn 90be3c24a7 [VPlan] Introduce new VPWidenCallRecipe (NFC).
This patch moves calls to their own recipe, to simplify the transition
to VPUser for operands of VPWidenRecipe, as discussed in D76992.

Subsequently additional information can be added to the recipe rather
than computing it during the execute step.

Reviewers: rengolin, Ayal, gilr, hsaito

Reviewed By: gilr

Differential Revision: https://reviews.llvm.org/D77467
2020-04-06 16:07:37 +01:00
Chris Bowler d6ea82d11c [AIX][PPC] Implement by-val caller arguments in multiple registers
Differential Revision: https://reviews.llvm.org/D76380
2020-04-06 11:06:51 -04:00
Guillaume Chatelet 808286342a [Alignment][NFC] Assume AlignmentFromAssumptions::getNewAlignment is always set.
Summary:
In D77454 we explain that `LoadInst` and `StoreInst` always have their alignment defined.
This allows us to work backward here and infer that `getNewAlignment` does not need to return `0` in case of failure.
Returning `1` also works since it needs to be greater than the Load/Store alignment, which is at least `1`.

This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77538
2020-04-06 14:54:57 +00:00
diggerlin a26a441b99 [llvm-objdump][XCOFF] Use symbol index+symbol name + storage mapping class as label for -D
SUMMARY:

For the llvm-objdump -D, the symbol name is used as a label in the disassembly for the specific address (when a symbol address is equal to the virtual address in the dump).

In XCOFF, multiple symbols may have the same name, being differentiated by their storage mapping class. It is helpful to print the QualName and not just the name when forming the output label for a csect symbol. The symbol index further removes any ambiguity caused by duplicate names.

To maintain compatibility with the binutils objdump, the XCOFF-specific --symbol-description option is added to enable the enhanced format.

Reviewers: hubert.reinterpretcast, James Henderson, Jason Liu, daltenty
Subscribers: wuzish, nemanjai, hiraditya

Differential Revision: https://reviews.llvm.org/D72973
2020-04-06 10:10:10 -04:00
Benjamin Kramer 880ec421dd [MC] Use a byte_swap in emitIntValue instead of doing it in a loop. NFCI. 2020-04-06 15:51:24 +02:00
Florian Hahn 6babae74c7 [Matrix] Update load/storeMatrix to take indices as Value* (NFC).
This allows the functions to be used with loop-dependent indices.
2020-04-06 14:48:48 +01:00
Matt Arsenault cbf719b568 AMDGPU: Use DAG patterns for div_fmas 2020-04-06 09:28:30 -04:00
Matt Arsenault 79b29d6df7 AMDGPU: Remove DisableInst feature
I'm not sure why these were bothering to check the instruction
profile, since those profiles should only be used with these
instruction classes.
2020-04-06 09:27:44 -04:00
Matt Arsenault 70726cec5b DAG: Combine extract_vector_elt of concat_vectors
Fixes extra canonicalize regressions when legalizing
vector fminnum/fmaxnum.
2020-04-06 09:26:29 -04:00
Hans Wennborg 64c2312750 Revert 43f031d312 "Enable IBT(Indirect Branch Tracking) in JIT with CET(Control-flow Enforcement Technology)"
ExecutionEngine/MCJIT/cet-code-model-lager.ll is failing on 32-bit
windows, see llvm-commits thread for fef2dab.

This reverts commit 43f031d312
and the follow-ups fef2dab100 and
6a800f6f62.
2020-04-06 15:05:25 +02:00
Sourabh Singh Tomar 5d7e9adce2 [DWARF5] Added support for emission of debug_macro section.
Summary:
This patch adds support for emission of following DWARFv5 macro forms
in .debug_macro section.

1. DW_MACRO_start_file
2. DW_MACRO_end_file
3. DW_MACRO_define_strp
4. DW_MACRO_undef_strp.

Reviewed By: dblaikie, ikudrin

Differential Revision: https://reviews.llvm.org/D72828
2020-04-06 17:45:10 +05:30
Pavel Labath 9154a6398e [llvm/Support] Make more DataExtractor methods error-aware
Summary:
This patch adds the optional Error argument, and the Cursor variants to
more DataExtractor methods. The functions now behave the same way as
other error-aware functions (they set the error when they fail, and
don't do anything if the error is already set).

I have merged the LEB128 implementations via a template (similarly to
how fixed-size functions are handled) to reduce code duplication.
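
The error-aware contract (set the error on failure, do nothing if an error is already set) can be illustrated with a small standalone sketch. It deliberately does not use the real llvm::DataExtractor API; `Cursor` and `getULEB128` here are simplified, hypothetical stand-ins, and overflow of the decoded value is ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for an offset plus a sticky error state.
struct Cursor {
  std::size_t Offset = 0;
  std::string Err; // empty means "no error"
  bool failed() const { return !Err.empty(); }
};

// Decodes one ULEB128 value. On failure it sets C.Err and leaves C.Offset
// alone; if C already carries an error it does nothing and returns 0.
uint64_t getULEB128(const std::vector<uint8_t> &Data, Cursor &C) {
  if (C.failed())
    return 0;
  assert(C.Offset <= Data.size() && "cursor points past the buffer");
  uint64_t Result = 0;
  unsigned Shift = 0;
  std::size_t Pos = C.Offset;
  while (Pos < Data.size()) {
    uint8_t Byte = Data[Pos++];
    if (Shift < 64) // bits beyond 64 are silently dropped in this sketch
      Result |= static_cast<uint64_t>(Byte & 0x7f) << Shift;
    Shift += 7;
    if ((Byte & 0x80) == 0) {
      C.Offset = Pos; // commit the new offset only on success
      return Result;
    }
  }
  C.Err = "unexpected end of data";
  return 0;
}
```

Because the error is sticky, a caller can issue several reads in a row and check the cursor once at the end.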

Depends on D77304.

Reviewers: dblaikie, aprantl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77306
2020-04-06 14:14:11 +02:00
Pavel Labath a16fffa3f6 [Support] Make DataExtractor string functions error-aware
Summary:
This patch adds an optional Error argument to DataExtractor functions
for string extraction, and makes them behave like other DataExtractor
functions (set the error if extraction fails, don't do anything if the
error is already set).

I have merged the StringRef and C string versions of the functions to
reduce code duplication.

Reviewers: dblaikie, MaskRay

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77307
2020-04-06 14:14:11 +02:00
Guillaume Chatelet ff858d7781 [Alignment][NFC] Add DebugStr and operator*
Summary:
This is a roll forward of D77394 minus AlignmentFromAssumptions (which needs to be addressed separately)
Differences from D77394:
 - DebugStr() now prints the alignment value or `None`, and no longer `Align(x)` or `MaybeAlign(x)`
   - This is to keep the Warning message consistent (CodeGen/SystemZ/alloca-04.ll)
 - Removed a few unneeded headers from Alignment (since it's included everywhere, it's better to keep the dependencies to a minimum)

Reviewers: courbet

Subscribers: sdardis, hiraditya, jrtc27, atanasyan, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77537
2020-04-06 12:09:45 +00:00
Guillaume Chatelet 39cfba9e33 [Alignment][NFC] Remove deprecated functions introduced in 10.0.0
Summary:
24 March 2020: LLVM 10.0.0 is out.
I gathered all deprecated functions introduced between 9 and 10 and cleaned them up so they will be removed from 11.

> git log -p -S LLVM_ATTRIBUTE_DEPRECATED llvmorg-9.0.0..llvmorg-10.0.0

Reviewers: courbet

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77409
2020-04-06 12:07:18 +00:00
Simon Pilgrim 9bc5b1a489 [X86][SSE] combineVectorSignBitsTruncation - remove minimum vector length limitations
truncateVectorWithPACK has its own vector length controls, so we can rely on those directly. This helps some existing truncation to subvector tests, which were being combined later during shuffle lowering at which point the sign/zero bit detection had become obscured preventing lowerShuffleWithPACK working as well as it could.
2020-04-06 12:45:23 +01:00
Benjamin Kramer 232eff55f6 [LTO] Replace hand-rolled endian conversion with support::endian. NFCI. 2020-04-06 13:23:27 +02:00
Benjamin Kramer e64e516790 [RuntimeDyld] Replace hand-rolled endian conversion with support::endian. NFCI. 2020-04-06 13:22:53 +02:00
Benjamin Kramer 9a9bc23672 [llvm-bcanalyzer] Simplify code. NFCI. 2020-04-06 12:50:50 +02:00
Kazushi (Jam) Marukawa e981a46a77 [VE] Update lea/load/store instructions
Summary:
Modify lea/load/store instructions to accept the `disp(index, base)`
style addressing mode (called ASX format).  Also, make the DAG nodes
for these ASX-format instructions uniformly take 3 operands, and
update the selectADDR functions to lower to the appropriate MI.

Reviewers: arsenm, simoll, k-ishizaka

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D76822
2020-04-06 11:49:46 +02:00
Oliver Stannard a294d9eb21 Revert "[IPRA][ARM] Spill extra registers at -Oz"
Reverting because this is causing failures on bots with expensive checks
enabled.

This reverts commit 73cea83a6f.
2020-04-06 10:34:59 +01:00
Kerry McLaughlin 944e322f88 [AArch64][SVE] Add SVE intrinsics for saturating add & subtract
Summary:
Adds the following intrinsics:
  - @llvm.aarch64.sve.[s|u]qadd.x
  - @llvm.aarch64.sve.[s|u]qsub.x

Reviewers: sdesmalen, c-rhodes, dancgr, efriedma, cameron.mcinally, rengolin

Reviewed By: efriedma

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, danielkiss, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77054
2020-04-06 10:07:08 +01:00
Florian Hahn 39f2d9aa81 [Matrix] Add option to use row-major matrix layout as default.
This patch adds a -matrix-default-layout option which can be used to
set the default matrix layout to row-major or column-major (default).

The initial patch updates codegen for loads, stores, binary operators
and matrix multiply.
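
As a reminder of what the two layouts mean for element addressing, here is a minimal standalone sketch. `Layout` and `flatIndex` are hypothetical names for illustration and are not part of the lowering pass.

```cpp
#include <cassert>
#include <cstddef>

enum class Layout { ColumnMajor, RowMajor };

// Flat offset of element (Row, Col) in a Rows x Cols matrix stored in a
// contiguous buffer using the given layout.
std::size_t flatIndex(Layout L, std::size_t Row, std::size_t Col,
                      std::size_t Rows, std::size_t Cols) {
  assert(Row < Rows && Col < Cols && "element out of bounds");
  return L == Layout::ColumnMajor ? Col * Rows + Row : Row * Cols + Col;
}
```

For a 2 x 3 matrix, element (0, 1) sits at offset 2 in column-major order and at offset 1 in row-major order.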

Reviewers: anemet, Gerolf, andrew.w.kaylor, LuoYuanke

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D76325
2020-04-06 10:00:56 +01:00
Florian Hahn d1fed7081d [Matrix] Add initial tiling for load/multiply/store chains.
This patch adds initial fusion for load/multiply/store chains of matrix
operations.

The patch contains roughly two parts:

1. Code generation for a fused load/multiply/store chain (LowerMatrixMultiplyFused).
First, we ensure that both loads of the multiply operands do not alias the store.
If they do, we create new non-aliasing copies of the operands. Note that this
may introduce new basic blocks. Finally, we process TileSize x TileSize blocks.
That is: load tiles from the input operands, multiply and store them.

2. Identify fusion candidates & matrix instructions.
As a first step, collect all instructions with shape info and fusion candidates
(currently @llvm.matrix.multiply calls). Next, try to fuse candidates and
collect instructions eliminated by fusion. Finally iterate over all matrix
instructions, skip the ones eliminated by fusion and lower the rest as usual.
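
The tile loop in part 1 can be sketched on plain arrays. This is a scalar, standalone C++ illustration of processing TileSize x TileSize blocks, not the IR that LowerMatrixMultiplyFused emits; `Matrix` and `multiplyTiled` are hypothetical names, and the sketch assumes the output does not alias either operand.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Dense column-major matrix of doubles, zero-initialized.
struct Matrix {
  std::size_t Rows, Cols;
  std::vector<double> Data;
  Matrix(std::size_t R, std::size_t C) : Rows(R), Cols(C), Data(R * C, 0.0) {}
  double &at(std::size_t R, std::size_t C) { return Data[C * Rows + R]; }
  double at(std::size_t R, std::size_t C) const { return Data[C * Rows + R]; }
};

// Computes Out = A * B one TileSize x TileSize block at a time.
// Assumes Out does not alias A or B.
void multiplyTiled(const Matrix &A, const Matrix &B, Matrix &Out,
                   std::size_t TileSize) {
  assert(TileSize > 0 && A.Cols == B.Rows);
  assert(Out.Rows == A.Rows && Out.Cols == B.Cols);
  std::fill(Out.Data.begin(), Out.Data.end(), 0.0);
  for (std::size_t J = 0; J < Out.Cols; J += TileSize)
    for (std::size_t I = 0; I < Out.Rows; I += TileSize)
      for (std::size_t K = 0; K < A.Cols; K += TileSize) {
        std::size_t JEnd = std::min(J + TileSize, Out.Cols);
        std::size_t IEnd = std::min(I + TileSize, Out.Rows);
        std::size_t KEnd = std::min(K + TileSize, A.Cols);
        // Multiply the (I, K) tile of A by the (K, J) tile of B and
        // accumulate into the (I, J) tile of Out.
        for (std::size_t C = J; C < JEnd; ++C)
          for (std::size_t R = I; R < IEnd; ++R)
            for (std::size_t X = K; X < KEnd; ++X)
              Out.at(R, C) += A.at(R, X) * B.at(X, C);
      }
}
```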

Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D75566
2020-04-06 09:28:15 +01:00
Guillaume Chatelet 6000478f39 Revert "[Alignment][NFC] Add DebugStr and operator*"
This reverts commit 1e34ab98fc.
2020-04-06 07:55:25 +00:00
Guillaume Chatelet 1e34ab98fc [Alignment][NFC] Add DebugStr and operator*
Summary:
Also updates files to use them.

This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: sdardis, hiraditya, jrtc27, atanasyan, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77394
2020-04-06 07:12:46 +00:00
Igor Kudrin 35819ff3cf [DebugInfo] Fix reading range lists of v5 units in DWP.
In package files, the base offset provided by index sections should be
used to find the contribution of a unit. The patch adds that base
offset when reading range list tables.

Differential revision: https://reviews.llvm.org/D77401
2020-04-06 13:28:06 +07:00
Igor Kudrin a93b77b97f [DebugInfo] Fix reading location tables headers of v5 units in DWP.
This fixes the reading of location lists headers for compilation units
in package files by adjusting the reading offset according to the
corresponding record in the unit index. This is required for
DW_FORM_loclistx to work.

Differential revision: https://reviews.llvm.org/D77146
2020-04-06 13:28:06 +07:00
Igor Kudrin 49737df767 [DebugInfo] Fix reading location tables of v5 units in DWP.
Without the patch, all version 5 compile units in a DWP file read
location tables from the beginning of a .debug_loclists.dwo section.
The patch fixes that by adjusting the reading offset the same way as
for pre-v5 units. The section identifier to find the contribution
entry corresponds to the version of the unit.

Differential revision: https://reviews.llvm.org/D77145
2020-04-06 13:28:06 +07:00
Igor Kudrin 714324b79a [DebugInfo] Support DWARFv5 index sections.
DWARFv5 defines index sections in package files in a slightly different
way than the pre-standard GNU proposal, see Section 7.3.5 in the DWARF
standard and https://gcc.gnu.org/wiki/DebugFissionDWP for GNU proposal.
The main concern here is the values of the section identifiers, which
partially overlap but have changed meanings. The patch adds support for
v5 index sections and resolves that difficulty by defining a set of
identifiers for internal use which can represent and distinguish the values
of both standards.

Differential Revision: https://reviews.llvm.org/D75929
2020-04-06 13:28:06 +07:00
Igor Kudrin a0249fe91c [DebugInfo] Rename section identifiers which are deprecated in DWARFv5. NFC.
This is a preparation for an upcoming patch which adds support for
DWARFv5 unit index sections. The patch adds tag "_EXT_" to identifiers
which reference sections that are deprecated in the DWARFv5 standard.
See D75929 for the discussion.

Differential Revision: https://reviews.llvm.org/D77141
2020-04-06 13:28:06 +07:00
Craig Topper 97e57f3b24 [DAGCombiner] Use getAnyExtOrTrunc instead of getSExtOrTrunc in the zext(setcc) combine.
We're ANDing with 1 right after which will cause the SIGN_EXTEND to
be combined to ANY_EXTEND later. Might as well just start with an
ANY_EXTEND.

While there, create the AND using the getZeroExtendInReg
helper to remove the need to explicitly create the VecOnes constant.
2020-04-05 22:44:45 -07:00
Johannes Doerfert 931c0cd713 [OpenMP][NFC] Move and simplify directive -> allowed clause mapping
Move the listing of allowed clauses per OpenMP directive to the new
macro file in `llvm/Frontend/OpenMP`. Also, use a single generic macro
that specifies the directive and one allowed clause explicitly instead
of a dedicated macro per directive.

We save 800 loc and boilerplate for all new directives/clauses with no
functional change. We also need to include the macro file only once and
not once per directive.

Depends on D77112.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D77113
2020-04-06 00:04:08 -05:00
Craig Topper 586c051a27 [DAGCombiner] Replace a hardcoded constant in visitZERO_EXTEND with a proper check for the condition it's trying to protect.
This code is replacing a shift with a new shift on an extended type.
If the shift amount type can't represent the maximum shift amount
for the new type, the amount needs to be extended to a type that
can.

Previously, the code just hardcoded a check for 256 bits which
seems to have been an assumption that the original shift amount
was MVT::i8. But that seems more catered to a specific target
like X86 that uses i8 as its legal shift amount type. Other
targets may use different types.

This commit changes the code to look at the real type of the shift
amount and makes sure it has enough bits for the Log2 of the
new type. There are similar checks to this in SelectionDAGBuilder
and LegalizeIntegerTypes.
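
The condition being checked can be written down as a small standalone predicate. This is a sketch of the idea only; `log2Ceil` and `shiftAmountFits` are hypothetical helpers, and the exact comparison used in DAGCombiner is an assumption here.

```cpp
#include <cassert>

// Number of bits needed to represent every value in [0, N-1],
// i.e. ceil(log2(N)).
unsigned log2Ceil(unsigned N) {
  assert(N > 0 && "log2 of zero is undefined");
  unsigned Bits = 0;
  while ((1ull << Bits) < N)
    ++Bits;
  return Bits;
}

// True if a shift-amount type of ShAmtBits bits can hold every valid shift
// amount (0 .. ValueBits-1) for a value type of ValueBits bits.
bool shiftAmountFits(unsigned ShAmtBits, unsigned ValueBits) {
  return ShAmtBits >= log2Ceil(ValueBits);
}
```

An 8-bit amount is enough for a 256-bit value but not for a 512-bit one, which is where a hardcoded 256 would come from if the amount type were always i8.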
2020-04-05 20:35:57 -07:00
Johannes Doerfert 419a559c5a [OpenMP][NFCI] Move OpenMP clause information to `lib/Frontend/OpenMP`
This is a cleanup and normalization patch that also enables reuse with
Flang later on. A follow up will clean up and move the directive ->
clauses mapping.

Reviewed By: fghanim

Differential Revision: https://reviews.llvm.org/D77112
2020-04-05 22:30:29 -05:00
Tarindu Jayatilaka b43b59fcc0 Expose `attributor-disable` to the new and old pass managers
The new and old pass managers (PassManagerBuilder.cpp and
PassBuilder.cpp) are exposed to an `extern` declaration of
`attributor-disable` option which will guard the addition of the
attributor passes to the pass pipelines.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D76871
2020-04-05 22:29:34 -05:00
Lang Hames 1b39c6f62c [ORC] Add MachO universal binary support to StaticLibraryDefinitionGenerator.
Add a new overload of StaticLibraryDefinitionGenerator::Load that takes a triple
argument and supports loading archives from MachO universal binaries in addition
to regular archives.

The LLI tool is updated to use this overload.
2020-04-05 20:21:05 -07:00
Simon Pilgrim a43e233606 Remove unused function 'isInRange'. NFCI. 2020-04-05 23:11:24 +01:00
Simon Pilgrim 4431a29c60 [X86][SSE] Combine unary shuffle(HORIZOP,HORIZOP) -> HORIZOP
We had previously limited the shuffle(HORIZOP,HORIZOP) combine to binary shuffles, but we can often merge unary shuffles just as well, folding in UNDEF/ZERO values into the 64-bit half lanes.

For the (P)HADD/HSUB cases this is limited to fast-horizontal cases but PACKSS/PACKUS combines under all cases.
2020-04-05 22:49:46 +01:00
Anna Thomas 1d0f757904 [InlineFunction] Update metadata on loads that are return values
This patch builds upon D76140 by updating metadata on pointer typed
loads in inlined functions, when the load is the return value, and the
callsite contains return attributes which can be updated as metadata on
the load.
Added test cases show this for nonnull, dereferenceable,
dereferenceable_or_null

Reviewed-By: jdoerfert

Differential Revision: https://reviews.llvm.org/D76792
2020-04-05 14:50:10 -04:00
Sourabh Singh Tomar 0d71782f4e [DebugInfo]: Allow DwarfCompileUnit to have line table symbol
Previously, the line table symbol was represented as `DIE::value_iterator`
inside `DwarfCompileUnit`, and the subsequent function `initStmtList`
was used to create a local `MCSymbol` to initialize it.

This patch removes `DIE::value_iterator` from `DwarfCompileUnit`
and introduces an `MCSymbol` to represent this unit's symbol for the
`debug_line` section. As a result, `applyStmtList` is also modified
to use it. Furthermore, a helper function `getLineTableStartSym`
is introduced to get this symbol; it will be used by clients
which need to access this line table, i.e. `debug_macro`.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D77489
2020-04-06 00:14:29 +05:30
Zuojian Lin a58c8a7866 Remove the additional constant which requires an extra register for statepoint lowering.
The newly-created constant zero will need an extra register to hold it
in the current statepoint lowering implementation. Remove it if there
exists one.
2020-04-05 11:22:09 -04:00
Apelete Seketeli 8aadb442d1 [scan-build] fix dead store warnings emitted on LLVM AMDGPU code base
This fixes dead store warnings of the type "dead assignment" reported
by Clang Static Analyzer.
2020-04-05 11:19:03 -04:00
Oliver Stannard cb6aeb2239 [ARM] Add data gathering hint instruction
Summary:
This patch upstreams support for the optional ARMv8.0 Data Gathering Hint (DGH)
extension, which adds the Data Gathering Hint instruction to the hint
space.

See ARMv8.0-DGH in the Arm Architecture Reference Manual Armv8 for more
information.

Reviewers: t.p.northover, rengolin, SjoerdMeijer, ab, danielkiss, samparker

Reviewed By: SjoerdMeijer

Subscribers: LukeGeeson, ostannard, kristof.beyls, hiraditya, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77097
2020-04-05 15:21:00 +01:00
Oliver Stannard 6f60eb4a3c [ARM] Add enhanced counter virtualization system registers
Summary:
This patch upstreams support for the ARMv8.6A Enhanced Counter Virtualization
(ECV) extension, which adds 6 new system registers.

See ARMv8.6-ECV in the Arm Architecture Reference Manual Armv8 for more
information.

Reviewers: t.p.northover, rengolin, SjoerdMeijer, pcc, ab, chill

Reviewed By: SjoerdMeijer

Subscribers: LukeGeeson, ostannard, kristof.beyls, hiraditya, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77094
2020-04-05 15:18:35 +01:00
Sanjay Patel 538a8f0227 [InstCombine] convert bitcast-shuffle to vector trunc
As discussed in D76983, that patch can turn a chain of insert/extract
with scalar trunc ops into bitcast+extract and existing instcombine
vector transforms end up creating a shuffle out of that (see the
PhaseOrdering test for an example). Currently, that process requires
at least this sequence: -instcombine -early-cse -instcombine.

Before D76983, the sequence of insert/extract would reach the SLP
vectorizer and become a vector trunc there.

Based on a small sampling of public targets/types, converting the
shuffle to a trunc is better for codegen in most cases (and a
regression of that form is the reason this was noticed). The trunc is
clearly better for IR-level analysis as well.

This means that we can induce "spontaneous vectorization" without
invoking any explicit vectorizer passes (at least a vector cast op
may be created out of scalar casts), but that seems to be the right
choice given that we started with a chain of insert/extract, and the
backend would expand back to that chain if a target does not support
the op.

Differential Revision: https://reviews.llvm.org/D77299
2020-04-05 09:48:02 -04:00
Oliver Stannard 9e1455dc23 [ARM] Add ARMv8.6 Fine Grain Traps system registers
Summary:
This patch upstreams support for the ARMv8.6A Fine Grain Traps (FGT)
extension, which adds 5 new system registers.

See ARMv8.6-FGT in the Arm Architecture Reference Manual Armv8 for more
information.

Reviewers: t.p.northover, rengolin, SjoerdMeijer, ab, momchil.velikov

Reviewed By: SjoerdMeijer

Subscribers: LukeGeeson, ostannard, kristof.beyls, hiraditya, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76991
2020-04-05 14:28:18 +01:00
Sanjay Patel 4036a0af24 [InstCombine] enhance freelyNegateValue() by handling 'not'
This patch extends D77230. If we have a 'not' instruction inside a
negated expression, we can ignore extra uses of that op because the
negation has a one-to-one replacement: negate becomes increment.
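
The underlying identity, -(~X) == X + 1 in wrapping arithmetic, can be checked exhaustively at 8 bits. The helper names below are hypothetical and the sketch is standalone C++, not InstCombine code.

```cpp
#include <cassert>
#include <cstdint>

// -(~X) in 8-bit wrapping arithmetic.
uint8_t negOfNot(uint8_t X) {
  uint8_t NotX = static_cast<uint8_t>(~X);
  return static_cast<uint8_t>(0u - NotX);
}

// X + 1 in 8-bit wrapping arithmetic.
uint8_t increment(uint8_t X) { return static_cast<uint8_t>(X + 1u); }

// Exhaustively compares the two forms; returns the number of values examined.
unsigned checkNegOfNot() {
  unsigned Checked = 0;
  for (unsigned V = 0; V < 256; ++V) {
    assert(negOfNot(static_cast<uint8_t>(V)) ==
           increment(static_cast<uint8_t>(V)));
    ++Checked;
  }
  return Checked;
}
```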

Alive2 examples of the test cases:
http://volta.cs.utah.edu:8080/z/T5-u9P
http://volta.cs.utah.edu:8080/z/eT89L6

Differential Revision: https://reviews.llvm.org/D77459
2020-04-05 09:16:19 -04:00
Sanjay Patel 867f0c3c4d [ValueTracking] enhance matching of smin/smax with 'not' operands
The cmyk tests are based on the known regression that resulted from:
rGf2fbdf76d8d0

So this improvement in analysis might be enough to restore that commit.
2020-04-05 08:54:12 -04:00
Diogo Sampaio 59d10dc703 [ARM] add ARMv8.6-A Activity monitors virtualization extension
Summary:
This patch upstreams v8.6A activity monitors virtualization
assembler support, which consists of 32 new system
registers (two groups, each with 16 numbered registers).

See ARMv8.6-AMU in the Arm Architecture Reference Manual Armv8 for more
information.

Reviewers: t.p.northover, rengolin, SjoerdMeijer, ab, john.brawn, ostannard

Reviewed By: ostannard

Subscribers: LukeGeeson, dnsampaio, ostannard, kristof.beyls, hiraditya, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76998
2020-04-05 13:31:06 +01:00
Benjamin Kramer ff889df356 [X86] Roll some loops. NFCI. 2020-04-05 13:59:50 +02:00
Florian Hahn 47ee404075 [ValueTracking] Use Inst::comesBefore in isValidAssumeForCtx (NFC).
D51664 added Instruction::comesBefore which should provide better
performance than the manual check.

Reviewers: rnk, nikic, spatel

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D76228
2020-04-05 12:38:04 +01:00
Simon Pilgrim 3079e51858 [X86][SSE] Generalize shuffle(HORIZOP,HORIZOP) -> HORIZOP combine
Our existing combine allows us to merge the shuffle of 2 similar 64-bit wide 'horizontal ops' (HADD/PACK/etc.) if the shuffle is an UNPCK/MOVSD.

This patch generalizes this to decode any target shuffle mask that can be widened to a 128-bit repeating v2*64 mask, which helps us catch PBLENDW/PBLENDD cases.
2020-04-05 12:09:19 +01:00
Simon Pilgrim a17de6b91c [X86][SSE] truncateVectorWithPACK - upper undef for 128->64 packing
If we're packing from 128-bits to 64-bits then we don't need the RHS argument. This helps with register allocation, especially as we avoid repeating a use of the input value.
2020-04-05 11:47:36 +01:00
Matt Arsenault 6bfe28e92f AMDGPU: Fix annotate kernel features through casted calls
I thought I was testing this before, but the workitem id x
case isn't great since it's mandatory in the parent kernel.
2020-04-04 20:44:44 -04:00
Matt Arsenault 221890d709 AMDGPU: Add feature for fast f32 denormals 2020-04-04 20:01:24 -04:00
Stefanos Baziotis f3dd3a66d3 [Attributor] AAUndefinedBehavior: Use AAValueSimplify in memory accessing instructions.
Query AAValueSimplify on pointers in memory accessing instructions to take
advantage of the constant propagation (or any other value simplification) of such values.
2020-04-05 02:46:26 +03:00
Jonathan Roelofs 3ce77142a6 Revert "[DAG] Fix PR45049: LegalizeTypes crash"
This reverts commit 17673ae0b2.
2020-04-04 13:47:22 -06:00
Jonathan Roelofs 17673ae0b2 [DAG] Fix PR45049: LegalizeTypes crash
Sometimes LegalizeTypes knows about common subexpressions before SelectionDAG
does, leading to accidental SDValue removal before its reference count was
truly zero.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=45049

https://reviews.llvm.org/D76994
2020-04-04 13:36:22 -06:00
Florian Hahn a2b18c5a08 [LV] Simplify tryToWiden as recipes are not re-used (NFC).
After 49d00824bb, VPWidenRecipe only stores a single instruction.
tryToWiden can simply return the widen recipe, like other helpers in
VPRecipeBuilder.
2020-04-04 18:30:50 +01:00
Heejin Ahn fc5d8b672b [WebAssembly] Fix a sanitizer error in WasmEHPrepare
Summary:
D77423 started using a dominator tree in WasmEHPrepare, but we deleted
BBs in `prepareThrows` before we used the domtree in `prepareEHPads`,
and those CFG changes were not reflected in the domtree. This uses
`DomTreeUpdater` to make sure we update the domtree every time we delete
BBs from the CFG. This fixes ubsan/msan/expensive_check errors caught in
LLVM buildbots.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77465
2020-04-04 09:57:07 -07:00
Nikita Popov 4ede730096 [InstCombine] Don't limit uses in eraseInstFromFunction()
eraseInstFromFunction() adds the operands of the erased instructions,
as those might now be dead as well. However, this is limited to
instructions with fewer than 8 operands.

This check doesn't make a lot of sense to me. As the instruction
gets removed afterwards, I don't see a potential for anything
overly pathological happening here (as we can only add those
operands to the worklist once). The impact on CTMark is in
the noise. We also have the same code in instruction sinking
and don't limit the operand count there.

Differential Revision: https://reviews.llvm.org/D77325
2020-04-04 18:37:30 +02:00
Luofan Chen eec6d87626 [Attributor] Deduce attributes for non-exact functions
This patch is based on D63312 and D63319. For now we create shallow wrappers for all functions that are IPO amendable.
See also [this github issue](https://github.com/llvm/llvm-project/issues/172).

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D76404
2020-04-04 11:34:58 -05:00
Heejin Ahn 2e9839729d [WebAssembly] Fix wasm.lsda() optimization in WasmEHPrepare
Summary:
When we insert a call to the personality function wrapper
(`_Unwind_CallPersonality`) for a catch pad, we store some necessary
info in the `__wasm_lpad_context` struct and pass it. One piece of that
info is the LSDA address for the function. For this, we insert a call to
`wasm.lsda()`, which will be lowered down to the address of the LSDA, and
store it in a field in `__wasm_lpad_context`.

There are exceptions to this personality call insertion: catchpads for
`catch (...)` and cleanuppads (for destructors) don't need personality
function calls, because we don't need to figure out whether the current
exception should be caught or not. (They always should.)

There was a little optimization to `wasm.lsda()` call insertion. Because
the LSDA address is the same throughout a function, we don't need to
insert a store of `wasm.lsda()` return value in every catchpad. For
example:
```
try {
  foo();
} catch (int) {
  // wasm.lsda() call and a store are inserted here, like, in
  // pseudocode,
  // %lsda = wasm.lsda();
  // store %lsda to a field in __wasm_lpad_context
  try {
    foo();
  } catch (int) {
    // We don't need to insert the wasm.lsda() and store again, because
    // to arrive here, we have already stored the LSDA address to
    // __wasm_lpad_context in the outer catch.
  }
}
```
So the previous algorithm checked whether the current catch has a parent EH
pad and, if it does, did not insert a call to `wasm.lsda()` and its store.

But this was incorrect, because what if the outer catch is `catch (...)`
or a cleanuppad?
```
try {
  foo();
} catch (...) {
  // wasm.lsda() call and a store are NOT inserted here
  try {
    foo();
  } catch (int) {
    // We need wasm.lsda() here!
  }
}
```
In this case we need to insert `wasm.lsda()` in the inner catchpad,
because the outer catchpad does not have one.

To minimize the number of inserted `wasm.lsda()` calls and stores, we
need a way to figure out whether we have encountered a `wasm.lsda()` call
in any of the EH pads that dominate the current EH pad. To figure that
out, we now visit EH pads in BFS order in the dominator tree so that we
visit parent BBs before their child BBs in the domtree.

We keep a set named `ExecutedLSDA`, which basically means "Do we have
`wasm.lsda()` either in the current EH pad or any of its parent EH
pads in the dominator tree?". This is to prevent scanning the domtree up
to the root in the worst case every time we examine an EH pad: each EH
pad only needs to examine its immediate parent EH pad.

- If any of its parent EH pads in the domtree has `wasm.lsda()`, this
  means we don't need `wasm.lsda()` in the current EH pad. We also insert
  the current EH pad in `ExecutedLSDA` set.
- If none of its parent EH pads has `wasm.lsda()`:
  - If the current EH pad is a `catch (...)` or a cleanuppad, done.
  - If the current EH pad is neither a `catch (...)` nor a cleanuppad,
    add `wasm.lsda()` and the store in the current EH pad, and add the
    current EH pad to `ExecutedLSDA` set.
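
The rules above can be sketched outside of LLVM as follows. This is only an
illustration, not the code in WasmEHPrepare: `EHPad`, `PadKind` and
`padsNeedingLSDA` are hypothetical names, and the dominator tree is reduced to
a `Parent` index per EH pad.

```cpp
#include <cassert>
#include <queue>
#include <set>
#include <vector>

// Hypothetical stand-in for an EH pad; not an LLVM type.
enum class PadKind { Catch, CatchAll, Cleanup };

struct EHPad {
  PadKind Kind;
  int Parent;                // nearest dominating EH pad, or -1 if none
  std::vector<int> Children; // EH pads directly below this one
};

// Returns the indices of the pads that need their own wasm.lsda() + store.
std::set<int> padsNeedingLSDA(const std::vector<EHPad> &Pads) {
  std::set<int> ExecutedLSDA; // pads where the LSDA store has already run
  std::set<int> NeedsLSDA;
  std::queue<int> Worklist;   // BFS: parents are visited before children
  for (int I = 0, E = static_cast<int>(Pads.size()); I != E; ++I)
    if (Pads[I].Parent == -1)
      Worklist.push(I);

  while (!Worklist.empty()) {
    int I = Worklist.front();
    Worklist.pop();
    const EHPad &Pad = Pads[I];
    assert(Pad.Parent < static_cast<int>(Pads.size()) && "bad parent index");
    if (Pad.Parent != -1 && ExecutedLSDA.count(Pad.Parent)) {
      // A dominating pad already stored the LSDA address; just propagate.
      ExecutedLSDA.insert(I);
    } else if (Pad.Kind == PadKind::Catch) {
      // Neither catch (...) nor a cleanuppad: insert the call and the store.
      NeedsLSDA.insert(I);
      ExecutedLSDA.insert(I);
    }
    for (int Child : Pad.Children)
      Worklist.push(Child);
  }
  return NeedsLSDA;
}
```

Because membership in `ExecutedLSDA` is propagated downwards, each pad only has
to look at its immediate parent rather than walk up to the root.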

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77423
2020-04-04 07:02:50 -07:00
Simon Pilgrim e5e719d885 [X86][SSE] lowerV8I16Shuffle - lower compaction shuffles using PACKUSDW(PBLENDW,PBLENDW) on SSE41+
Similar to the lowerV16I8Shuffle implementation, for binary compaction v8i16 shuffles we can avoid the PUNPCKLDQ(PSHUFB,PSHUFB) pattern on SSE41+ targets by using PACKUSDW and PBLENDW. Before SSE41 we would need to use PACKSSDW but that requires sign extension that seems to destroy any gains, even on targets without PSHUFB.

This is a bigger gain on AMD than Intel targets but should never be a regression, and avoiding the shuffle mask load(s) is always useful.

Noticed in codegen while dealing with PR31443.
2020-04-04 13:08:25 +01:00
Nikita Popov b90ea4f341 [IRBuilder] Move some code into the cpp file; NFC
Since D73835 we no longer need to define the whole IRBuilder
implementation in the header. This patch moves some of the larger
methods out of line, into the C++ file.

Differential Revision: https://reviews.llvm.org/D77332
2020-04-04 12:52:56 +02:00
Nikita Popov 6896d559f3 [VNCoercion] Use IRBuilderBase; NFC
And remove include from header.
2020-04-04 12:44:50 +02:00
Nikita Popov ebd5a1b049 [Reassociate] Use IRBuilderBase; NFC
And remove now unnecessary IRBuilder.h include in header.
2020-04-04 12:34:16 +02:00
Nikita Popov 1055e9e3c8 [IVDescriptors] Remove IRBuilder.h include; NFC
IVDescriptors.h itself does not reference IRBuilder at all.
Move the include into transformation passes that do.
2020-04-04 12:07:57 +02:00
Nikita Popov a5eb1236e3 [IVDescriptors] Remove unnecessary DemandedBits.h include; NFC
Forward declare DemandedBits in IVDescriptors, and move include
into the cpp file. Also drop the include from LoopUtils, which
does not need it at all.
2020-04-04 12:07:57 +02:00
Craig Topper 1d42c0db9a Revert "[X86] Add a Pass that builds a Condensed CFG for Load Value Injection (LVI) Gadgets"
This reverts commit c74dd640fd.

Reverting to address coding standard issues raised in post-commit
review.
2020-04-03 16:56:08 -07:00
Craig Topper a505ad58cf Revert "[X86] Add Support for Load Hardening to Mitigate Load Value Injection (LVI)"
This reverts commit 62c42e29ba

Reverting to address coding standard issues raised in post-commit
review.
2020-04-03 16:55:53 -07:00
Scott Constable 62c42e29ba [X86] Add Support for Load Hardening to Mitigate Load Value Injection (LVI)
After finding all such gadgets in a given function, the pass minimally inserts
LFENCE instructions in such a manner that the following property is satisfied:
for all SOURCE+SINK pairs, all paths in the CFG from SOURCE to SINK contain at
least one LFENCE instruction. The algorithm that implements this minimal
insertion is influenced by an academic paper that minimally inserts memory
fences for high-performance concurrent programs:

http://www.cs.ucr.edu/~lesani/companion/oopsla15/OOPSLA15.pdf

The algorithm implemented in this pass is as follows:

1. Build a condensed CFG (i.e., a GadgetGraph) consisting only of the following components:
  -SOURCE instructions (also includes function arguments)
  -SINK instructions
  -Basic block entry points
  -Basic block terminators
  -LFENCE instructions
2. Analyze the GadgetGraph to determine which SOURCE+SINK pairs (i.e., gadgets) are already mitigated by existing LFENCEs. If all gadgets have been mitigated, go to step 6.
3. Use a heuristic or plugin to approximate minimal LFENCE insertion.
4. Insert one LFENCE along each CFG edge that was cut in step 3.
5. Go to step 2.
6. If any LFENCEs were inserted, return true from runOnFunction() to tell LLVM that the function was modified.
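
The property checked in step 2 (every path from a SOURCE to its SINK crosses an
LFENCE) amounts to a reachability query that refuses to walk through fences.
A minimal sketch, assuming a simplified graph in which fences are nodes;
`GadgetGraphSketch` and `isMitigated` are hypothetical names and not the
pass's data structures:

```cpp
#include <cassert>
#include <vector>

// Hypothetical condensed graph: node I has successors Succs[I], and
// IsFence[I] marks the nodes that are LFENCE instructions.
struct GadgetGraphSketch {
  std::vector<std::vector<int>> Succs;
  std::vector<bool> IsFence;
};

// True if every path from Source to Sink passes through a fence node.
bool isMitigated(const GadgetGraphSketch &G, int Source, int Sink) {
  assert(G.Succs.size() == G.IsFence.size() && "one fence flag per node");
  std::vector<bool> Visited(G.Succs.size(), false);
  std::vector<int> Stack = {Source};
  Visited[Source] = true;
  while (!Stack.empty()) {
    int N = Stack.back();
    Stack.pop_back();
    for (int S : G.Succs[N]) {
      if (G.IsFence[S] || Visited[S])
        continue; // a fence cuts this path; visited nodes are already handled
      if (S == Sink)
        return false; // reached the sink without crossing a fence
      Visited[S] = true;
      Stack.push_back(S);
    }
  }
  return true;
}
```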

By default, the heuristic used in Step 3 is a greedy heuristic that avoids
inserting LFENCEs into loops unless absolutely necessary. There is also a
CLI option to load a plugin that can provide even better optimization,
inserting fewer fences, while still mitigating all of the LVI gadgets.
The plugin can be found here: https://github.com/intel/lvi-llvm-optimization-plugin,
and a description of the pass's behavior with the plugin can be found here:
https://software.intel.com/security-software-guidance/insights/optimized-mitigation-approach-load-value-injection.

Differential Revision: https://reviews.llvm.org/D75937
2020-04-03 13:45:50 -07:00
Scott Constable c74dd640fd [X86] Add a Pass that builds a Condensed CFG for Load Value Injection (LVI) Gadgets
Adds a new data structure, ImmutableGraph, and uses RDF to find LVI gadgets and add them to a MachineGadgetGraph.

More specifically, a new X86 machine pass finds Load Value Injection (LVI) gadgets consisting of a load from memory (i.e., SOURCE), and any operation that may transmit the value loaded from memory over a covert channel, or use the value loaded from memory to determine a branch/call target (i.e., SINK).

Also adds a new target feature to X86: +lvi-load-hardening

The feature can be added via the clang CLI using -mlvi-hardening.

Differential Revision: https://reviews.llvm.org/D75936
2020-04-03 13:02:04 -07:00
Alina Sbirlea 688450c7f0 [GraphDiff] Extend GraphDiff to track a list of updates.
Summary:
This patch includes two extensions:
1. It extends the GraphDiff to also keep the original list of updates
after legalization, not just the deletes/insert vectors.
It also provides an API to pop the first update (the updates are stored
in reverse, such that the first update is at the end of the list).
2. It adds a bool to mark whether the given updates should be applied as
given, or applied in reverse. This moves the task of reversing the
updates (when the caller needs this) to a functionality inside
GraphDiff, versus having the caller do this.

The two changes could be split into two patches, but they seemed
reasonably small to be reviewed together.
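
As an illustration of the first extension only, the reverse storage makes
popping the first update a cheap pop from the back of a vector. The names
below (`Update`, `UpdateListSketch`, `popFirst`) are hypothetical and do not
mirror the GraphDiff API:

```cpp
#include <cassert>
#include <vector>

enum class UpdateKind { Insert, Delete };

// Hypothetical edge update; From/To stand in for CFG nodes.
struct Update {
  UpdateKind Kind;
  int From;
  int To;
};

class UpdateListSketch {
  std::vector<Update> Reversed; // the first update to apply sits at the back

public:
  explicit UpdateListSketch(const std::vector<Update> &InOrder)
      : Reversed(InOrder.rbegin(), InOrder.rend()) {}

  bool empty() const { return Reversed.empty(); }

  // Removes and returns the update that comes first in application order.
  Update popFirst() {
    assert(!Reversed.empty() && "no updates left to pop");
    Update U = Reversed.back();
    Reversed.pop_back();
    return U;
  }
};
```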

Reviewers: kuhar, dblaikie

Subscribers: hiraditya, george.burgess.iv, mgrang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77167
2020-04-03 12:10:36 -07:00
Scott Constable f95a67d8b8 [X86] Add RET-hardening Support to mitigate Load Value Injection (LVI)
Adding a pass that replaces every ret instruction with the sequence:

pop <scratch-reg>
lfence
jmp *<scratch-reg>

where <scratch-reg> is some available scratch register, according to the
calling convention of the function being mitigated.

Differential Revision: https://reviews.llvm.org/D75935
2020-04-03 12:08:34 -07:00
Matt Arsenault 30ebafaa56 CodeGen: Convert some TII hooks to use Register
2020-04-03 14:52:54 -04:00
Matt Arsenault 178050c3ba AMDGPU: Use Register in more places
2020-04-03 14:52:54 -04:00
Matt Arsenault e8dcb6d05e AMDGPU: Remove redundant virtual
2020-04-03 14:52:53 -04:00
Christopher Tetreault b600809688 Clean up usages of asserting vector getters in Type
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: kparzysz, sdesmalen, efriedma

Reviewed By: kparzysz

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77267
2020-04-03 11:26:51 -07:00
Stanislav Mekhanoshin 0462795095 [AMDGPU] Propagate AGPR RC from PHI to its PHI operands
We can fix register class of PHI based on its all AGPR uses.
That leaves behind all PHIs which were already processed
earlier. Propagate RC back to PHI operands of a PHI.

Differential Revision: https://reviews.llvm.org/D77344
2020-04-03 11:23:02 -07:00
Simon Pilgrim 2225797567 [YAMLParser] Scanner::setError - ensure we use the StringRef::iterator argument (PR45043)
As detailed on PR45043, static analysis was warning that the StringRef::iterator Position argument was being ignored and the function was hardwired to use the Current iterator.

This patch ensures we use the provided iterator and removes the (barely necessary) setError wrapper that always used Current.

Differential Revision: https://reviews.llvm.org/D76512
2020-04-03 18:55:38 +01:00
Sanjay Patel ce97ce3a5d [VectorCombine] try to form a better extractelement
Extracting to the same index that we are going to insert back into
allows forming select ("blend") shuffles and enables further transforms.

Admittedly, this is a quick-fix for a more general problem that I'm
hoping to solve by adding transforms for patterns that start with an
insertelement.

But this might resolve some regressions known to be caused by the
extract-extract transform (although I have not gotten more details on
those yet).

In the motivating case from PR34724:
https://bugs.llvm.org/show_bug.cgi?id=34724

The combination of subsequent instcombine and codegen transforms gets us this improvement:

  vmovshdup	%xmm0, %xmm2    ## xmm2 = xmm0[1,1,3,3]
  vhaddps	%xmm1, %xmm1, %xmm4
  vmovshdup	%xmm1, %xmm3    ## xmm3 = xmm1[1,1,3,3]
  vaddps	%xmm0, %xmm2, %xmm0
  vaddps	%xmm1, %xmm3, %xmm1
  vshufps	$200, %xmm4, %xmm0, %xmm0 ## xmm0 = xmm0[0,2],xmm4[0,3]
  vinsertps	$177, %xmm1, %xmm0, %xmm0 ## xmm0 = zero,xmm0[1,2],xmm1[2]

  -->

  vmovshdup	%xmm0, %xmm2    ## xmm2 = xmm0[1,1,3,3]
  vhaddps	%xmm1, %xmm1, %xmm1
  vaddps	%xmm0, %xmm2, %xmm0
  vshufps	$200, %xmm1, %xmm0, %xmm0 ## xmm0 = xmm0[0,2],xmm1[0,3]

Differential Revision: https://reviews.llvm.org/D76623
2020-04-03 13:55:13 -04:00
Sylvain Audi e4ae0a2e97 [Support/Path] sys::path::replace_path_prefix fix and simplifications
Added unit tests for 2 scenarios that were failing.
Restored replace_path_prefix to 3 parameters instead of 5, simplifying the implementation. The other 2 were always used with their default values.

This commit is intended to be the first of 3:
1) simplify/fix replace_path_prefix.
2) use it in the context of -fdebug-prefix-map and -fmacro-prefix-map (see D76869).
3) Make Windows version of replace_path_prefix insensitive to both case and separators (slash vs backslash).
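
For readers unfamiliar with the helper, a three-parameter prefix replacement
can be sketched on `std::string` as below. This is an assumption-laden
stand-in (`replacePathPrefixSketch` is a hypothetical name), not the
sys::path implementation, and it is deliberately case- and
separator-sensitive:

```cpp
#include <cassert>
#include <string>

// Replaces OldPrefix at the start of Path with NewPrefix.
// Returns true if Path started with OldPrefix and was rewritten.
bool replacePathPrefixSketch(std::string &Path, const std::string &OldPrefix,
                             const std::string &NewPrefix) {
  if (OldPrefix.empty() && NewPrefix.empty())
    return false; // nothing to do
  if (Path.compare(0, OldPrefix.size(), OldPrefix) != 0)
    return false; // Path does not start with OldPrefix
  Path.replace(0, OldPrefix.size(), NewPrefix);
  assert(Path.compare(0, NewPrefix.size(), NewPrefix) == 0 &&
         "rewritten path must start with the new prefix");
  return true;
}
```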

Differential Revision: https://reviews.llvm.org/D77223
2020-04-03 13:50:23 -04:00
Simon Pilgrim 34a497b765 [X86][SSE] lowerShuffleWithPACK - extend to use chained PACKs for larger truncations
Extend lowerShuffleWithPACK/matchShuffleWithPACK/createPackShuffleMask to handle compaction style shuffle masks that can be lowered to chains of PACKSS/PACKUS if their inputs are suitably sign/zero extended.

This helps avoid PSHUFB (and its mask load) for short shuffle chains; shuffle combining will still replace them with a PSHUFB if we have enough shuffles, as getFauxShuffleMask should recognise the PACKSS/PACKUS chains.
2020-04-03 18:26:10 +01:00
Roman Lebedev 7d572ef2dd Revert "[SCEV] rewriteLoopExitValues(): even if have hard uses, still rewrite if cheap (PR44668)"
As discussed in post-commit review in https://reviews.llvm.org/D73501
if the goal of this is to help vectorizer, then we should actually
be teaching vectorizer to do this, because right now this rewrite
is still budget-limited, which isn't what we'd want.

Additionally, while the rest of the patch series was universally profitable,
this particular patch is reportedly (https://reviews.llvm.org/D73501#1905171)
exposing cost-modeling issues on ARM.

So let's just back this particular patch out. Once there's an undo transform,
this could be considered for reintegration.

This reverts commit 44edc6fd2c.
2020-04-03 20:15:04 +03:00
John Brawn 4ad9ca0f9e [ARM] Fix incorrect handling of big-endian vmov.i64
Currently when the target is big-endian vmov.i64 reverses the order of the two
words of the vector. This is correct only when the underlying element type is
32-bit, as actually what it should be doing is considering it a vector of the
underlying type and reversing the elements of that.
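
The difference between the two behaviours can be shown on a plain 64-bit
value. The helper below is a hypothetical illustration (`reverseElements` is
not a function in the ARM backend): it reverses the order of the elements of
the given width, which matches swapping the two words only when the element
width is 32 bits.

```cpp
#include <cassert>
#include <cstdint>

// Reverses the order of the EltBits-wide elements packed in a 64-bit value.
uint64_t reverseElements(uint64_t Imm, unsigned EltBits) {
  assert((EltBits == 8 || EltBits == 16 || EltBits == 32 || EltBits == 64) &&
         "unsupported element width");
  if (EltBits == 64)
    return Imm; // a single element: nothing to reverse
  const uint64_t Mask = (uint64_t{1} << EltBits) - 1;
  uint64_t Result = 0;
  for (unsigned I = 0; I < 64 / EltBits; ++I) {
    // Move the lowest remaining element to the bottom of the result.
    Result = (Result << EltBits) | (Imm & Mask);
    Imm >>= EltBits;
  }
  return Result;
}
```

With 16-bit elements the result differs from a plain word swap, which is the
case the old handling got wrong.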

Differential Revision: https://reviews.llvm.org/D76515
2020-04-03 17:36:50 +01:00
John Brawn cd58fb6325 [ARM] Avoid pointless vrev of element-wise vmov
If an element-wise vmov immediate instruction is followed by a vrev whose
width is greater than or equal to the vmov element width, that vrev won't do
anything. Add a DAG combine to convert bitcasts that would become such vrevs
into vector_reg_casts instead.

Differential Revision: https://reviews.llvm.org/D76514
2020-04-03 17:36:50 +01:00
Matt Arsenault 57a55313c3 InstCombine: Reduce minnum/maxnum if inputs are casted
2020-04-03 11:57:25 -04:00
jasonliu d65557d15d [NFC][XCOFF][AIX] Refactor get/setContainingCsect
Summary:
In the current architecture, we always require setContainingCsect to be
called on every MCSymbol used in an XCOFF context.
This is very hard to achieve because symbols get created everywhere
and other MCSymbol types (ELF, COFF) do not have similar rules.
It's very easy to miss setting the containing csect, and we would
need to add a lot of XCOFF-specialized code around some common code areas.

This patch intends to:
1. Rely on getFragment().getParent() to get the csect from labels.
2. Only use get/setRepresentedCsect (was get/setContainingCsect)
if the symbol itself represents a csect.

Reviewers: DiggerLin, hubert.reinterpretcast, daltenty

Differential Revision: https://reviews.llvm.org/D77080
2020-04-03 13:33:12 +00:00
Guillaume Chatelet 9068bccbae [Alignment][NFC] Deprecate InstrTypes getRetAlignment/getParamAlignment
Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77312
2020-04-03 13:21:58 +00:00
Guillaume Chatelet 1a584a8d50 [Alignment][NFC] Remove unused private functions
Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77297
2020-04-03 09:16:20 +00:00
Guillaume Chatelet ca11c480e7 [Alignment][NFC] Convert MachineIRBuilder::buildDynStackAlloc to Align
Summary:
The change in IRTranslator is not trivial but is NFC as far as I can tell.

This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77292
2020-04-03 09:05:19 +00:00
OCHyams 9b56cc9361 [DebugInfo] Salvage debug info when sinking loop invariant instructions
Reviewed By: vsk, aprantl, djtodoro

Differential Revision: https://reviews.llvm.org/D77318
2020-04-03 09:19:26 +01:00
Guillaume Chatelet 9f5c786876 [NFC] G_DYN_STACKALLOC realign iff align > 1, update documentation
Summary: I think it would be better to require the alignment to be >= 1. It is currently confusing to allow both values.

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77372
2020-04-03 08:12:39 +00:00
scentini 6825920b18 Silence -Wpessimizing-move warning
2020-04-03 09:37:39 +02:00
Scott Constable 5b519cf1fc [X86] Add Indirect Thunk Support to X86 to mitigate Load Value Injection (LVI)
This pass replaces each indirect call/jump with a direct call to a thunk that looks like:

lfence
jmpq *%r11

This ensures that if the value in register %r11 was loaded from memory, then
the value in %r11 is (architecturally) correct prior to the jump.
Also adds a new target feature to X86: +lvi-cfi
("cfi" meaning control-flow integrity)
The feature can be added via clang CLI using -mlvi-cfi.

This is an alternate implementation to https://reviews.llvm.org/D75934 that merges the thunk insertion functionality with the existing X86 retpoline code.

Differential Revision: https://reviews.llvm.org/D76812
2020-04-03 00:34:39 -07:00
scentini 0a3845b70f Silence -Wpessimizing-move warning
2020-04-03 09:24:26 +02:00
Igor Kudrin f13ce15d44 [DebugInfo] Rename getOffset() to getContribution(). NFC.
The old name was a bit misleading because the functions actually return
contributions to the corresponding sections.

Differential revision: https://reviews.llvm.org/D77302
2020-04-03 14:15:53 +07:00
Sourabh Singh Tomar 69c8fb1c65 [DWARF5] Added support for debug_macro section parsing and dumping in llvm-dwarfdump.
Summary:
This patch adds parsing and dumping of the DWARFv5 .debug_macro section in llvm-dwarfdump.
It does not introduce any new switch. The existing switch "--debug-macro"
should be used to dump either the macinfo or the macro section.

Reviewed By: dblaikie, ikudrin, jhenderson

Differential Revision: https://reviews.llvm.org/D73086
2020-04-03 12:23:51 +05:30
Serguei Katkov bd1d70bf0e [DAG] Change isGCValue detection for statepoint lowering
isGCValue should detect whether the deopt value is a GC pointer.
Currently it checks by finding the value in SI.Bases and SI.Ptrs.
However these data structures contain only those values which
have corresponding gc.relocate call. So we can miss GC value if it
does not have gc.relocate call (dead after the call).

Instead, ask the GC strategy whether the pointer is a GC pointer, or
conservatively consider any pointer to be a GC pointer.

Reviewers: reames, dantrushin
Reviewed By: reames
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D77130
2020-04-03 12:36:13 +07:00
Scott Constable b1d581019f [X86] Refactor X86IndirectThunks.cpp to Accommodate Mitigations other than Retpoline
Introduce a ThunkInserter CRTP base class from which new thunk types can inherit, e.g., thunks to mitigate https://software.intel.com/security-software-guidance/software-guidance/load-value-injection.

Differential Revision: https://reviews.llvm.org/D76811
2020-04-02 22:09:54 -07:00
Scott Constable 71e8021d82 [X86][NFC] Generalize the naming of "Retpoline Thunks" and related code to "Indirect Thunks"
There are applications for indirect call/branch thunks other than retpoline for Spectre v2, e.g.,

https://software.intel.com/security-software-guidance/software-guidance/load-value-injection

Therefore it makes sense to refactor X86RetpolineThunks as a more general capability.

Differential Revision: https://reviews.llvm.org/D76810
2020-04-02 21:55:13 -07:00
Hongtao Yu 88da019977 Fix a bug in the inliner that causes subsequent double inlining
Summary:
A recent change in the instruction simplifier enables a call to a function that just returns one of its parameters to be simplified as simply loading the parameter. This exposes a bug in the inliner where double inlining may be involved, which in turn may cause a compiler ICE when an already-inlined callsite is reused for further inlining.
To put it simply, in a C program like the following, when the function call second(t) is inlined, its code t = third(t) will be reduced to just loading the return value of the callsite first(). This causes the inliner's internal data structure to register the first() callsite for the call edge representing the third() call, and therefore incurs a double inlining when both call edges are considered inline candidates. I'm making a fix to stop the inliner from reusing a callsite for new call edges.

```
void top()
{
    int t = first();
    second(t);
}

void second(int t)
{
   t = third(t);
   fourth(t);
}

int third(int t)
{
   return t;
}
```
The actual failing case is much trickier than the example here and is only reproducible with the legacy inliner. The way the legacy inliner works is to process each SCC in a bottom-up order. That means in reality function first may be already inlined into top, or function third is either inlined to second or is folded into nothing. To repro the failure seen from building a large application, we need to figure out a way to confuse the inliner so that the bottom-up inlining is not fulfilled. I'm doing this by making the second call indirect so that the alias analyzer fails to figure out the right call graph edge from top to second and top can be processed before second during the bottom-up traversal. We also need to tweak the test code so that when the inlining of top happens, the function body of second is not that optimized, by delaying the pass of function attribute deducer (i.e., the pass which tells us that function third has no side effect and just returns its parameter). Since the CGSCC pass is iterative, additional calls are added to top to postpone the inlining of second to the second round right after the first function attribute deducing pass is done. I haven't been able to repro the failure with the new pass manager since the processing order of inlined callsites is a bit different, but in theory the issue could happen there too.

Note that this fix could introduce a side effect that blocks the simplification of inlined code, specifically for a call site that can be folded to another call site. I hope this can probably be complemented by subsequent inlining or folding, as shown in the attached unit test. The ideal fix should be to separate the use of VMap. However, in reality this failing pattern shouldn't happen often. And even if it happens, there should be a good chance that the non-folded call site will be refolded by iterative inlining or subsequent simplification.

Reviewers: wenlei, davidxl, tejohnson

Reviewed By: wenlei, davidxl

Subscribers: eraman, nikic, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76248
2020-04-02 21:08:05 -07:00
Xiang1 Zhang 43f031d312 Enable IBT(Indirect Branch Tracking) in JIT with CET(Control-flow Enforcement Technology)
Summary:
This patch comes from H.J.'s 2bd54ce7fa

**This patch fixes the failing llvm unit tests when running on a CET machine.** (e.g. ExecutionEngine/MCJIT/MCJITTests)

The main reason we enable IBT when the JIT is compiled with CET is that the JIT doesn't know whether its caller program is CET-enabled or not.
If the JIT's caller program is non-CET, it does not matter whether the JIT generates CET code or not.
But if the JIT's caller program is CET-enabled, the JIT must generate CET code or it will cause control protection exceptions.

I have tested the patch with llvm-unit-test and llvm-test-suite on a CET machine, and it passed.
H.J. also tested it by building and running VNCserver (Virtual Network Console), and it works too.
(Without this patch, VNCserver crashes on a CET machine.)

Reviewers: hjl.tools, craig.topper, LuoYuanke, annita.zhang, pengfei

Subscribers: tstellar, efriedma, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76900
2020-04-03 11:44:07 +08:00
Jessica Paquette 71947ed927 [AArch64][GlobalISel] Constrain reg operands in selectBrJT
This was causing a machine verifier failure on the test suite.

Make sure that we don't end up with a weird register class here.

Failure for reference:

*** Bad machine code: Illegal virtual register for instruction ***
- function:    check_constrain
- basic block: %bb.1  (0x7f8b70839f80)
- instruction: early-clobber %6:gpr64, early-clobber %7:gpr64sp =
  JumpTableDest32 %5:gpr64, %1:gpr64sp, %jump-table.0
- operand 3:   %1:gpr64sp
Expected a GPR64 register, but got a GPR64sp register

Differential Revision: https://reviews.llvm.org/D77349
2020-04-02 20:34:11 -07:00
Wenju He fe8ac0fe51 [x86] Fix Intel OpenCL builtin CalleeSavedRegs on skx
Summary: Align with AVX512 builtins implementations, some of which don't preserve rdi.

Reviewers: yubing, tianqing, craig.topper

Reviewed By: craig.topper

Subscribers: yaxunl, Anastasia, hiraditya

Differential Revision: https://reviews.llvm.org/D77032
2020-04-03 11:27:40 +08:00
Qiu Chaofan 71f1ab5354 [PowerPC] Remove unnecessary XSRSP instruction
MI peephole will remove unnecessary FRSP instructions. This patch
removes such unnecessary XSRSP.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D77208
2020-04-03 11:05:14 +08:00
Jun Ma 9c6f32a0ff [Coroutines] Simplify implementation using removePredecessor
Differential Revision: https://reviews.llvm.org/D77035
2020-04-03 09:20:07 +08:00
Austin Kerbow 30f18ed387 [AMDGPU] Handle SMRD signed offset immediate
Summary:
This fixes a few issues related to SMRD offsets. On gfx9 and gfx10 we have a
signed byte offset immediate, however we can overflow into a negative since we
treat it as unsigned.

Also, the SMRD SOFFSET sgpr is an unsigned offset on all subtargets. We
sometimes tried to use negative values here.

Third, S_BUFFER instructions should never use a signed offset immediate.
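
The first issue comes down to a range check: an offset that is meant to be
non-negative only survives encoding in a signed field if it fits in one bit
less than the field width. A minimal sketch, assuming the field width is
supplied by the caller (the message does not state it, and
`fitsNonNegativeInSignedField` is a hypothetical helper, not an AMDGPU
function):

```cpp
#include <cassert>
#include <cstdint>

// True if ByteOffset can be encoded in a signed field of FieldBits bits
// without being reinterpreted as a negative value.
bool fitsNonNegativeInSignedField(uint64_t ByteOffset, unsigned FieldBits) {
  assert(FieldBits >= 2 && FieldBits <= 64 && "unsupported field width");
  // A signed N-bit field holds non-negative values up to 2^(N-1) - 1.
  const uint64_t MaxNonNegative = (uint64_t{1} << (FieldBits - 1)) - 1;
  return ByteOffset <= MaxNonNegative;
}
```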

Differential Revision: https://reviews.llvm.org/D77082
2020-04-02 17:41:52 -07:00
Adrian Prantl 93fe58c9cf Teach the stripNonLineTableDebugInfo pass about the llvm.dbg.label intrinsic.
Debug info for labels is not generated at -gline-tables-only, so this
pass should remove them.

Differential Revision: https://reviews.llvm.org/D77345
2020-04-02 17:39:33 -07:00
Adrian Prantl c024f3ebdc Teach the stripNonLineTableDebugInfo pass about the llvm.dbg.addr intrinsic.
This patch also strips llvm.dbg.addr intrinsics when downgrading debug
info to linetables-only.

Differential Revision: https://reviews.llvm.org/D77343
2020-04-02 17:39:33 -07:00
Lang Hames 05598441de Re-apply 0071eaaf08, "[ORC] Export __cxa_atexit ...", with fixes.
Forgot to include part of the testcase. Thanks to Nico for spotting that and
reverting!
2020-04-02 16:03:35 -07:00
Matt Arsenault f68cc2a7ed AMDGPU: Use 128-bit DS operations by default
2020-04-02 17:17:47 -04:00
Matt Arsenault 5660bb6bc9 AMDGPU: Remove denormal subtarget features
Switch to using the denormal-fp-math/denormal-fp-math-f32 attributes.
2020-04-02 17:17:12 -04:00
Matt Arsenault 75cf30918f AMDGPU: Assume f32 denormals are enabled by default
This will likely introduce catastrophic performance regressions on
older subtargets, but should be correct. A follow up change will
remove the old fp32-denormals subtarget features, and switch to using
the new denormal-fp-math/denormal-fp-math-f32 attributes. Frontends
should be making sure to add the denormal-fp-math-f32 attribute when
appropriate to avoid performance regressions.
2020-04-02 17:17:12 -04:00
Cyndy Ishida fd4d07517b [llvm][TextAPI] adding inlining reexported libraries support
Summary:

* this patch adds reader/writer support for MachO tbd files.
The usecase is to represent reexported libraries in top level library
that won't need to exist for linker indirection because all of the
needed content will be inlined in the same document.

Reviewers: ributzka, steven_wu, jhenderson

Reviewed By: ributzka

Subscribers: JDevlieghere, hiraditya, mgrang, dexonsmith, rupprecht, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67646
2020-04-02 13:05:08 -07:00
Craig Topper 4fdb63bbf0 [X86] Enable combineExtSetcc for vectors larger than 256 bits when we've disabled 512 bit vectors.
The compares are going to be type legalized to 256 bits so we
might as well fold the extend.
2020-04-02 12:44:27 -07:00
Anna Thomas bf7a16a768 [InlineFunction] Update valid return attributes at callsite within callee body
Consider a callee function that has a call (C) within it which feeds
into the return. When we inline that callee into a callsite that has
return attributes, we can backward propagate valid attributes to the
call (C) within that inlined callee body.

This is safe only if we can guarantee transfer of execution to the
successor in the window of instructions between the return value (i.e. the
call C) and the return instruction.

Also, this is valid only for attributes which are a property of a
callsite and not those that are not dependent on the ABI, or a property
of the call itself.

Reviewed-By: reames, jdoerfert

Differential Revision: https://reviews.llvm.org/D76140
2020-04-02 14:13:12 -04:00
Matt Arsenault c3d3c22a58 AMDGPU: Hack out noinline on functions using LDS globals
This is a workaround for clang adding noinline to all functions at
-O0. Previously, we would just add alwaysinline, and the verifier
would complain about having both noinline and alwaysinline. We
currently can't truly codegen this case as a freestanding function, so
override the user forcing noinline.
2020-04-02 14:12:07 -04:00
Sanjay Patel f4448063cc [InstCombine] try to reduce shuffle with bitcasted operand
shuf (bitcast X), undef, Mask --> bitcast X'

The 'inverse shuffles' test (shuf_bitcast_operand) is a pattern
in the motivating examples from PR35454:
https://bugs.llvm.org/show_bug.cgi?id=35454
(see also D76727)

We can deal with this class of patterns in generic instcombine
because we are not creating any new shuffles, just a bitcast.

Alive2 proof:
http://volta.cs.utah.edu:8080/z/mwDUZf

Differential Revision: https://reviews.llvm.org/D76844
2020-04-02 13:44:50 -04:00
Sanjay Patel b6050ca181 [VectorCombine] transform bitcasted shuffle to narrower elements
bitcast (shuf V, MaskC) --> shuf (bitcast V), MaskC'

We do not attempt this in InstCombine because we do not want to change
types and create new shuffle ops that are potentially not lowered as
well as the original code. Here, we can check the cost model to see if
it is worthwhile.

I've aggressively enabled this transform even if the types are the same
size and/or equal cost because moving the bitcast allows InstCombine to
make further simplifications.

In the motivating cases from PR35454:
https://bugs.llvm.org/show_bug.cgi?id=35454
...this is enough to let instcombine and the backend eliminate the
redundant shuffles, but we probably want to extend VectorCombine to
handle the inverse pattern (shuffle-of-bitcast) to get that
simplification directly in IR.

Differential Revision: https://reviews.llvm.org/D76727
2020-04-02 13:30:22 -04:00
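The rewrite `bitcast (shuf V, MaskC) --> shuf (bitcast V), MaskC'` in the VectorCombine commit above can be illustrated with a small model. This is only a sketch, not the LLVM implementation: it assumes a single-operand shuffle with no undef mask elements, little-endian lane order, and a narrowed mask in which each wide index `m` expands to `scale` consecutive narrow indices. The helper names are hypothetical.

```python
def bitcast_to_narrow(vec, scale, narrow_bits):
    """Reinterpret each wide element as `scale` narrow elements.

    Assumes little-endian lane order: the low bits of a wide element
    become the first narrow element.
    """
    lane_mask = (1 << narrow_bits) - 1
    return [(x >> (i * narrow_bits)) & lane_mask
            for x in vec
            for i in range(scale)]


def shuffle(vec, mask):
    """Single-operand shuffle: result[i] = vec[mask[i]]."""
    return [vec[m] for m in mask]


def narrow_mask(mask, scale):
    """Expand each wide-element index into `scale` narrow-element indices."""
    return [m * scale + i for m in mask for i in range(scale)]


# Four 32-bit elements viewed as eight 16-bit elements (scale = 2).
v = [0x11112222, 0x33334444, 0x55556666, 0x77778888]
mask = [3, 0, 2, 1]

bitcast_of_shuffle = bitcast_to_narrow(shuffle(v, mask), 2, 16)
shuffle_of_bitcast = shuffle(bitcast_to_narrow(v, 2, 16), narrow_mask(mask, 2))
assert bitcast_of_shuffle == shuffle_of_bitcast
```

Both orders produce the same narrow vector in this model, which is why the bitcast can be moved ahead of the shuffle once the mask is rescaled.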
Stanislav Mekhanoshin f2334a7ef2 [AMDGPU] Fix crash in SILoadStoreOptimizer
SILoadStoreOptimizer::checkAndPrepareMerge() expects the base and
paired instructions to come in order and scans the MBB from the base to
the paired instruction. The original order can change if
there was a dependent instruction in between and the base instruction
was moved.

Fixed by bailing out of the optimization. In theory it might still be
possible to perform a merge by swapping instructions, but in practice
it bails anyway because it finds a dependency on the same instruction
that caused the base to move.

Differential Revision: https://reviews.llvm.org/D77245
2020-04-02 10:26:47 -07:00
Sanjay Patel 12fcbcecff [InstCombine] add tests for cmyk benchmark; NFC
These are versions of a function that regressed with:
rGf2fbdf76d8d0

That particular problem occurs with an instcombine-simplifycfg-instcombine
sequence, but we can show that it exists within instcombine only with
other variations of the pattern.
2020-04-02 13:00:46 -04:00
Benjamin Kramer de8831934a [LoopDataPrefetch] Remove unused include that's a layering violation 2020-04-02 17:46:10 +02:00
Benjamin Kramer dffc503187 Revert "[SimplifyLibCalls] Erase replaced instructions"
This reverts commit 2a77544ad5. This
introduces a use-after-free in Transforms/InstCombine/sincospi.ll.
Found by asan.
2020-04-02 17:30:47 +02:00
Jonas Paulsson 7e02da7db5 [SystemZ] Add isCommutable flag on vector instructions.
This does not change much in code generation, but in rare cases MachineCSE
can figure out that an instruction is redundant after commuting it.

Review: Ulrich Weigand
2020-04-02 16:06:15 +02:00
Sanjay Patel 1008435f3d Revert "[InstCombine] do not exclude min/max from icmp with casted operand fold"
This reverts commit f2fbdf76d8.
As noted in the post-commit thread:
https://reviews.llvm.org/rGf2fbdf76d8d0
...this can obscure a min/max pattern where the components
have extra uses. We can show that the problem is independent
of this change with a slightly modified source example, so
this revert just delays/reduces the need to fix the real
problem.

We need to improve our analysis of negation or -- more
generally -- subtraction using patches like D77230 or D68408.
2020-04-02 09:15:23 -04:00
Tyker c00cb76274 [NFC] Split Knowledge retention and place it more appropriately
Summary:
Splitting Knowledge retention into Queries in Analysis and Builder into Transform/Utils
allows Queries and Transform/Utils to use Analysis.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77171
2020-04-02 15:01:41 +02:00
Jonas Paulsson 36d4421f50 [LoopDataPrefetch + SystemZ] Let target decide on prefetching for each loop.
This patch adds

- New arguments to getMinPrefetchStride() to let the target decide on a
  per-loop basis if software prefetching should be done even with a stride
  within the limit of the hw prefetcher.

- New TTI hook enableWritePrefetching() to let a target do write prefetching
  by default (defaults to false).

- In LoopDataPrefetch:

  - A search through the whole loop to gather information before emitting any
    prefetches. This way the target can get information via new arguments to
    getMinPrefetchStride() and emit prefetches more selectively. Collected
    information includes: Does the loop have a call, how many memory
    accesses, how many of them are strided, how many prefetches will cover
    them. This is NFC to before as long as the target does not change its
    definition of getMinPrefetchStride().

  - If a previous access to the same exact address was 'read', and the
    current one is 'write', make it a 'write' prefetch.

  - If two accesses that are covered by the same prefetch do not dominate
    each other, put the prefetch in a block that dominates both of them.

  - If a ConstantMaxTripCount is less than ItersAhead, then skip the loop.

- A SystemZ implementation of getMinPrefetchStride().

Review: Ulrich Weigand, Michael Kruse

Differential Revision: https://reviews.llvm.org/D70228
2020-04-02 14:57:46 +02:00
Simon Pilgrim b02c7a8152 Fix "result of 32-bit shift implicitly converted to 64 bits" MSVC warning. NFCI.
The shift of 1 is by an amount that is never more than 31, so the warning is a false positive, but the change is safe and fixes Werror builds.
2020-04-02 12:02:04 +01:00
David Green fbd53ffc3a [ARM] MVE VMULL patterns
This adds MVE vmull patterns, which are conceptually the same as
mul(vmovl, vmovl), and so the tablegen patterns follow the same
structure.

For i8 and i16 this is simple enough, but in the i32 version the
multiply (in 64bits) is illegal, meaning we need to catch the pattern
earlier in a dag fold. Because bitcasts are involved in the zext
versions, the patterns are a little different in little and big
endian, so I have only added little endian support in this patch.

Differential Revision: https://reviews.llvm.org/D76740
2020-04-02 10:57:40 +01:00
David Green c697dd9ffd [ARM] Make remaining MVE instruction predictable
The unpredictable/hasSideEffects flag is usually inferred by tablegen
from whether the instruction has a tablegen pattern (and that pattern
only has a single output instruction). Now that the MVE intrinsics are
all committed and producing code, the remaining instructions still
marked as unpredictable need to be specially handled. This adds the flag
directly to instructions that need it, notably the V*MLAL instructions
and some of the MOV's.

Differential Revision: https://reviews.llvm.org/D76910
2020-04-02 10:57:40 +01:00
Guillaume Chatelet 96cae168fa [NFC] Preparatory work for D77292 2020-04-02 09:30:33 +00:00
Clement Courbet fb4aa30f27 [ExpandMemCmp] Allow overlapping loads in the zero-relational case.
Summary:
This allows doing `memcmp(p, q, 7)` with 2 loads instead of a call to
memcmp.
This fixes part of PR45147.

Reviewers: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76133
2020-04-02 11:20:47 +02:00
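The overlapping-load idea in the ExpandMemCmp commit above can be sketched as follows. This is an illustration under assumptions, not the pass itself: it assumes "zero-relational" means the `memcmp` result is only compared against zero (so only its sign matters), and it uses two 4-byte big-endian loads per buffer, at offsets 0 and 3, to cover 7 bytes. The function names are hypothetical.

```python
def load_be32(buf, offset):
    """Load 4 bytes as a big-endian unsigned integer.

    Big-endian order makes integer comparison agree with bytewise
    lexicographic comparison.
    """
    return int.from_bytes(buf[offset:offset + 4], "big")


def memcmp7_sign(p, q):
    """Return -1, 0 or 1 with the same sign as memcmp(p, q, 7).

    The second load starts at offset 3, so byte 3 is read twice; the
    overlap is harmless because the second load is only reached when
    bytes 0..3 are already known to be equal.
    """
    for offset in (0, 3):
        a, b = load_be32(p, offset), load_be32(q, offset)
        if a != b:
            return -1 if a < b else 1
    return 0
```

For example, `memcmp7_sign(b"abcdefg", b"abcdefh")` is negative because the first difference is in the last byte and `g` sorts before `h`.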
Florian Hahn a63b5c9e53 [CallSiteSplitting] Simplify isPredicateOnPHI & continue checking PHIs.
As pointed out by @thakis, currently CallSiteSplitting bails out after
checking the first PHI node. We should check all PHI nodes, until we
find one where call site splitting is beneficial.

This patch also slightly simplifies the code using BasicBlock::phis().

Reviewers: davidxl, junbuml, thakis

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D77089
2020-04-02 10:11:27 +01:00
Guillaume Chatelet 189d2e215f [Alignment][NFC] Use more Align versions of various functions
Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: MatzeB, qcolombet, arsenm, sdardis, jvesely, nhaehnle, hiraditya, jrtc27, atanasyan, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77291
2020-04-02 09:00:53 +00:00
OCHyams 550ab58bc1 [NFC] Fix performance issue in LiveDebugVariables
When compiling AMDGPUDisassembler.cpp in a stage 1 trunk build with
CMAKE_BUILD_TYPE=RelWithDebInfo LLVM_USE_SANITIZER=Address LiveDebugVariables
accounts for 21.5% of wall clock time. This fix reduces that to 1.2% by
replacing a linked list lookup with a map lookup.

Note that the linked list is still used to group UserValues by vreg. The vreg
lookups don't cause any problems in this pathological case.

This is the same idea as D68816, which was reverted, except that it is a less
intrusive fix.

Reviewed By: vsk

Differential Revision: https://reviews.llvm.org/D77226
2020-04-02 09:39:33 +01:00
Djordje Todorovic 29d253c4c6 [Object] Add the method for checking if a section is a debug section
Different file formats have different naming styles for the debug
sections. The method is implemented for ELF, COFF and Mach-O formats.

Differential Revision: https://reviews.llvm.org/D76276
2020-04-02 10:56:00 +02:00
WangTianQing d08fadd662 [X86] Add SERIALIZE instruction.
Summary: For more details about this instruction, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference

Reviewers: craig.topper, RKSimon, LuoYuanke

Reviewed By: craig.topper

Subscribers: mgorny, hiraditya, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77193
2020-04-02 16:19:23 +08:00
Shengchen Kan 9f92d4612f Revert "[NFC][X86] Refine code in X86AsmBackend"
This reverts commit a157cde0ac.
2020-04-02 15:57:06 +08:00
Shengchen Kan a157cde0ac [NFC][X86] Refine code in X86AsmBackend
Replace pattern getContents().size with universe function call
2020-04-02 15:41:10 +08:00
Johannes Doerfert 1858f4b50d Revert "[OpenMP][NFCI] Move OpenMP clause information to `lib/Frontend/OpenMP`"
This reverts commit c18d55998b.

Bots have reported uses that need changing, e.g.,
  clang-tools-extra/clang-tidy/openmp/UseDefaultNoneCheck.cp
as reported by
  http://lab.llvm.org:8011/builders/clang-ppc64be-linux/builds/46591
2020-04-02 02:23:22 -05:00
Johannes Doerfert c18d55998b [OpenMP][NFCI] Move OpenMP clause information to `lib/Frontend/OpenMP`
This is a cleanup and normalization patch that also enables reuse with
Flang later on. A follow up will clean up and move the directive ->
clauses mapping.

Differential Revision: https://reviews.llvm.org/D77112
2020-04-02 01:39:07 -05:00
Fangrui Song cbd3969e8c [PPCInstPrinter] Delete an unneeded overload of printBranchOperand. NFC
It was added by D76591 for migration purposes (not all
printBranchOperand users have migrated to the overload with `uint64_t Address`).
Now that all have been migrated, the parameter can go away.
2020-04-01 22:45:25 -07:00
Fangrui Song 85adce3d73 [PPCInstPrinter] Change B to print the target address in hexadecimal form
Follow-up of D76591 and D76907
2020-04-01 22:38:24 -07:00
Johannes Doerfert bcd8009369 [Attributor] Use the proper context instruction in genericValueTraversal
There was a TODO in genericValueTraversal to provide the context
instruction and due to the lack of it users that wanted one just used
something available. Unfortunately, using a fixed instruction is wrong
in the presence of PHIs so we need to update the context instruction
properly.

Reviewed By: uenoku

Differential Revision: https://reviews.llvm.org/D76870
2020-04-01 22:20:47 -05:00
Johannes Doerfert ac96c8fd85 [Attributor][FIX] Do not compute ranges for arguments of declarations
This cannot be triggered right now, as far as I know, but it doesn't
make sense to deduce a constant range on arguments of declarations.
Exposed during testing of AAValueSimplify extensions.
2020-04-01 22:05:30 -05:00
Johannes Doerfert 54d6a608bf [Attributor][NFC] Predetermine the module
It could happen that we delete the first function in the SCC in the
future so we should be careful accessing `Functions` after the manifest
stage.
2020-04-01 21:56:17 -05:00
Johannes Doerfert 9e19693994 [Attributor] Derive better alignment for accessed pointers
Use DL & ABI information for better alignment deduction, e.g., if a type
is accessed and the ABI specifies an alignment requirement for such an
access we can use it. This is based on a patch by @lebedev.ri and
inspired by getBaseAlign in Loads.cpp.

Depends on D76673.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D76674
2020-04-01 21:49:57 -05:00
Nico Weber 5bac8d427d Revert "[ORC] Export __cxa_atexit from the main JITDylib in LLJIT."
This reverts commit 0071eaaf08.
Inputs/noop-main.ll wasn't checked in, so this breaks check-llvm
everywhere.
2020-04-01 22:49:38 -04:00
Johannes Doerfert b1c788d051 [Attributor][FIX] Prevent alignment breakage wrt. must-tail calls
If we have a must-tail call the callee and caller need to have matching
ABIs. Part of that is alignment which we might modify when we deduce
alignment of arguments of either. Since we would need to keep them in
sync, which is not as simple, we simply avoid deducing alignment for
arguments of the must-tail caller or callee.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D76673
2020-04-01 21:40:07 -05:00
Lang Hames 0071eaaf08 [ORC] Export __cxa_atexit from the main JITDylib in LLJIT.
Failure to export __cxa_atexit can lead to an attempt to import a definition
from the process itself (if __cxa_atexit is referenced from another JITDylib),
but the process definition will clash with the existing non-exported definition
to produce an unexpected DuplicateDefinitionError.

This patch fixes the immediate issue by exporting __cxa_atexit. It also fixes a
bug where atexit functions in other JITDylibs were not being run by adding a
copy of run_atexits_helper to every JITDylib.

A follow up patch will deal with the bug where definition generators are called
despite a non-exported definition being present.
2020-04-01 19:12:08 -07:00
Johannes Doerfert 41f2a57d0b [Attributor][NFC] Use a BumpPtrAllocator to allocate `AbstractAttribute`s
We create a lot of AbstractAttributes and they live as long as
the Attributor does. It seems reasonable to allocate them via a
BumpPtrAllocator owned by the Attributor.

Reviewed By: baziotis

Differential Revision: https://reviews.llvm.org/D76589
2020-04-01 20:53:28 -05:00
Sam Clegg 296ccef703 [WebAssembly] EmscriptenEHSjLj: Mark __invoke_ functions as imported
This means the linker will expect them to be undefined at link time and
will generate imports from the `env` module rather than reporting
undefined externals.

Differential Revision: https://reviews.llvm.org/D77192
2020-04-01 16:33:33 -07:00
Daniel Sanders e65e677ee4 [globalisel][legalizer] Fix DebugLoc bugs caught by a prototype lost-location verifier
The legalizer has a tendency to lose DebugLoc's when expanding or
combining instructions. The verifier that detected these isn't ready
for upstreaming yet but this patch fixes the cases that came up when
applying it to our out-of-tree backend's CodeGen tests.

This pattern comes up a few more times in this file and probably in
the backends too but I'd prefer to fix the others separately (and
preferably when the lost-location verifier detects them).
2020-04-01 12:50:18 -07:00
Lang Hames 8e5a8f620c [ORC] Don't require a null-terminator on MemoryBuffers for objects in archives.
The MemoryBuffer::getMemBuffer method's RequiresNullTerminator parameter
defaults to true, but object files are not null terminated so we need to
explicitly pass false here.
2020-04-01 12:16:38 -07:00
Sanjay Patel 3d90048791 [InstCombine] enhance freelyNegateValue() by handling xor
Negation is equivalent to bitwise-not + 1, so try to convert more
subtracts into adds using this relationship:
0 - (A ^ C) => ((A ^ C) ^ -1) + 1 => (A ^ ~C) + 1

I doubt this will recover the regression noted in rGf2fbdf76d8d0,
but seems like we're going to need to improve here and/or revive D68408?

Alive2 proofs:
http://volta.cs.utah.edu:8080/z/Re5tMU
http://volta.cs.utah.edu:8080/z/An-uns

Differential Revision: https://reviews.llvm.org/D77230
2020-04-01 15:05:13 -04:00
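The identity behind the freelyNegateValue() commit above, `0 - (A ^ C) == (A ^ ~C) + 1`, can be checked exhaustively at a small bit width. The snippet below is a sanity check in modular arithmetic, not the InstCombine code; the 8-bit width and the helper names are assumptions chosen to keep the search small.

```python
BITS = 8
MASK = (1 << BITS) - 1


def neg(x):
    """Two's-complement negation at BITS width."""
    return (-x) & MASK


def bit_not(x):
    """Bitwise not at BITS width."""
    return x ^ MASK


def negate_xor_as_add(a, c):
    """Compute 0 - (a ^ c) as (a ^ ~c) + 1, wrapping at BITS width."""
    return ((a ^ bit_not(c)) + 1) & MASK


# Every pair of 8-bit values agrees with plain negation.
assert all(neg(a ^ c) == negate_xor_as_add(a, c)
           for a in range(1 << BITS)
           for c in range(1 << BITS))
```

The check works because negation is bitwise-not plus one, and the bitwise-not can be folded into the constant operand of the xor.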
Jonathan Roelofs 1148f004fa Fix PR45371: SeparateConstOffsetFromGEP clean up bookkeeping
find() was altering the UserChain, even in cases where it subsequently
discovered that the resulting constant was a 0. This confuses
rebuildWithoutConstOffset() when it attempts to walk the chain later, since it
is expected that the chain itself be a path down the use-def edges of an
expression.
2020-04-01 12:38:15 -06:00
Nikita Popov 50a3e8738a Revert "[InstCombine] Erase old instruction when replacing extractelements"
This reverts commit d40368fdb5.

llvm-clang-x86_64-expensive-checks-debian failure looks related.
2020-04-01 20:10:11 +02:00
Nikita Popov 2a77544ad5 [SimplifyLibCalls] Erase replaced instructions
After RAUWing an instruction, also erase it. This makes sure we
don't perform extra InstCombine iterations to clean up the garbage.
2020-04-01 20:00:10 +02:00
Uday Bondhugula 6ee11c3b0f [NewGVN] Make NewGVN aware of aligned_alloc
Make the New GVN pass aware of aligned_alloc.

Depends on D76975.

Differential Revision: https://reviews.llvm.org/D76976
2020-04-01 23:26:51 +05:30
Uday Bondhugula 4cf70af94f [GVN] Make GVN aware of aligned_alloc
Make the GVN pass aware of aligned_alloc.

Depends on D76974.

Differential Revision: https://reviews.llvm.org/D76975
2020-04-01 23:26:50 +05:30
Uday Bondhugula c4499e3333 [Attributor] Make attributor aware of aligned_alloc for heap to stack conversion
Make the attributor pass aware of aligned_alloc for converting heap
allocations to stack ones.

Depends on D76971.

Differential Revision: https://reviews.llvm.org/D76974
2020-04-01 23:26:50 +05:30
Nikita Popov d40368fdb5 [InstCombine] Erase old instruction when replacing extractelements
As we are not returning the result of replaceInstUsesWith(),
we need to clean up ourselves.

NFC apart from worklist order.
2020-04-01 19:55:28 +02:00
Nikita Popov 4b35c816ef [InstCombine] Use replaceOperand() in div transforms
To make sure the old operand is DCEd.

NFC apart from worklist order.
2020-04-01 19:55:00 +02:00
Matt Arsenault 5e4e8d0388 AMDGPU/GlobalISel: Change intrinsic ID for _L to _LZ opt
We should still handle the other case that changes the opcode this way.
2020-04-01 13:03:02 -04:00
Heejin Ahn c87b5e7e22 [WebAssembly] Fix subregion relationship in CFGSort
Summary:
The previous code for determining the innermost region in CFGSort was
not correct. We determine subregion relationship by domination of their
headers, i.e., if region A's header dominates region B's header, B is a
subregion of A. Previously we assumed that if a BB belongs to both a
loop and an exception, the region with fewer BBs is the
innermost one. This may not be true, because while WebAssemblyException
contains BBs in all its subregions (loops or exceptions), MachineLoop
may not, because MachineLoop does not contain BBs that don't have a path
to its header even if they are dominated by its header.

                Loop header  <---|
                    |            |
              Exception header   |
                    | \          |
                    A  B         |
                    |   \        |
                    |    C       |
                    |            |
                Loop latch       |
                    |            |
                    -------------|

For example, in this CFG, the loop does not contain B and C, because
they don't have a path back to the loop's header. But for CFGSort we
consider the exception here belongs to the loop and the exception should
be a subregion of the loop and scheduled together.

So here we should use `WE->contains(ML->getHeader())` (but not
`ML->contains(WE->getHeader())`, for the stated region above).

This also fixes some comments and deletes the `Regions` vector in the
`RegionInfo` class, which was not used anywhere.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77181
2020-04-01 08:12:41 -07:00
Jessica Clarke 616289ed29 [LegalizeTypes][RISCV] Correctly sign-extend comparison for ATOMIC_CMP_XCHG
Summary:
Currently, the comparison argument used for ATOMIC_CMP_XCHG is legalised
with GetPromotedInteger, which leaves the upper bits of the value
undefined. Since this is used for comparing in an LR/SC loop with a
full-width comparison, we must sign extend it. We introduce a new
getExtendForAtomicCmpSwapArg to complement getExtendForAtomicOps, since
many targets have compare-and-swap instructions (or pseudos) that
correctly handle an any-extend input, and the existing function
determines the extension of the result, whereas we are concerned with
the input.

This is related to https://reviews.llvm.org/D58829, which solved the
issue for ATOMIC_CMP_SWAP_WITH_SUCCESS, but not the simpler
ATOMIC_CMP_SWAP.

Reviewers: asb, lenary, efriedma

Reviewed By: asb

Subscribers: arichardson, hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, jfb, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, evandro, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74453
2020-04-01 15:51:26 +01:00
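The reason the comparison argument needs sign extension, as described in the ATOMIC_CMP_XCHG commit above, can be shown with a small model of 64-bit registers. This is a sketch under assumptions: it models only the compare step of an LR/SC loop on a 32-bit word, relies on `lr.w` sign-extending the loaded word on RV64, and uses hypothetical helper names.

```python
XLEN_MASK = (1 << 64) - 1


def sext32(x):
    """Sign-extend the low 32 bits of x to a 64-bit register value."""
    x &= 0xFFFFFFFF
    if x & 0x80000000:
        x |= 0xFFFFFFFF00000000
    return x


def lr_w(memory_word):
    """Model of lr.w on RV64: the loaded 32-bit word is sign-extended."""
    return sext32(memory_word)


def compare_matches(memory_word, expected_reg):
    """Full-width register comparison, as done inside the LR/SC loop."""
    return lr_w(memory_word) == (expected_reg & XLEN_MASK)


memory_word = 0xFFFFFFFF                # i32 value -1 stored in memory
any_extended = 0x00000000FFFFFFFF       # promoted value with arbitrary upper bits
assert not compare_matches(memory_word, any_extended)
assert compare_matches(memory_word, sext32(any_extended))
```

With upper bits left unspecified, the full-width comparison can fail even though the low 32 bits are equal; sign-extending the expected value first makes the comparison agree with the 32-bit one.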
Guillaume Chatelet fc63c4d8ce [Alignment][NFC] Remove remaining uses of MachineFrameInfo::setObjectAlignment
Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77217
2020-04-01 14:38:05 +00:00
Simon Pilgrim eb8880562e [X86][SSE] combinePTESTCC - fold TESTZ(X,~Y) -> TESTC(Y,X) 2020-04-01 15:10:53 +01:00
Guillaume Chatelet 1dffa2550b [Alignment][NFC] Transition to MachineFrameInfo::getObjectAlign()
Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77215
2020-04-01 14:08:28 +00:00
Kai Wang 501522b5b2 [RISCV] Support RISC-V ELF attributes sections in llvm-readobj.
Enable llvm-readobj to handle RISC-V ELF attribute sections.

Differential Revision: https://reviews.llvm.org/D75833
2020-04-01 21:50:11 +08:00
Simon Pilgrim be7a233e93 Fix operator precedence warning. NFCI. 2020-04-01 14:36:52 +01:00
Simon Pilgrim 552e46ea1e Fix unused variable warnings. NFCI. 2020-04-01 14:36:51 +01:00
Benjamin Kramer b605c56b0f [ARM] Silence warning in Release builds
llvm/lib/Target/ARM/MVEVPTBlockPass.cpp:175:37: error: unused variable 'BlockBeg' [-Werror,-Wunused-variable]
  MachineBasicBlock::instr_iterator BlockBeg = Iter;
                                    ^
2020-04-01 15:29:19 +02:00
Guillaume Chatelet 3a78f44daf [Alignment][NFC] Convert SelectionDAG::InferPtrAlignment to MaybeAlign
Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77212
2020-04-01 13:22:11 +00:00
Simon Pilgrim 481413d394 [X86][SSE] matchShuffleWithPACK - generalize zero/signbits matching for any packed src type
First step toward making use of canLowerByDroppingEvenElements to match chains of PACKSS/PACKUS for compaction shuffles.

At the moment we still only match a single stage but the MatchPACK is now more general.
2020-04-01 14:10:32 +01:00
shchenz e344f8b9db Revert "[LSR] re-add testcase for wrongly phi node elimination - NFC"
This reverts commit f25a1b4f58.
ARM and Hexagon fail on the newly added case.
2020-04-01 12:58:06 +00:00
Guillaume Chatelet bf573bea19 [Alignment][NFC] Convert MIR Yaml to MaybeAlign
Summary:
Although it may look like it is not NFC, it is. In particular, the MIRParser may set `0` in the MachineFrameInfo and MachineFunction, but they all deal with `Align` internally and assume that `0` means `1`.
93fc0ba145/llvm/include/llvm/CodeGen/MachineFrameInfo.h (L483)

This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77203
2020-04-01 12:26:31 +00:00
Pierre-vh 2effe8f5e7 [Target][ARM] Improvements to the VPT Block Insertion Pass
This allows the MVE VPT Block insertion pass to remove VPNOTs in
order to create more complex VPT blocks such as TE, TEET, TETE, etc.

Differential Revision: https://reviews.llvm.org/D75993
2020-04-01 12:34:20 +01:00
Pierre-vh dad848280d [Target][ARM] Change VPTMaskValues to the correct encoding
VPTMaskValue was using the "instruction" encoding to represent the masks
(= the same encoding as the one used by the instructions in an object file),
but it is only used to build MCOperands, so it should use the MCOperand
encoding of the masks, which is slightly different.

Differential Revision: https://reviews.llvm.org/D76139
2020-04-01 12:34:20 +01:00
Benjamin Kramer 66b9f5f7f0 [GVNSink] Simplify code. NFC. 2020-04-01 13:13:00 +02:00
shchenz f25a1b4f58 [LSR] re-add testcase for wrongly phi node elimination - NFC
Retest the case on X86/SystemZ/AArch64/PowerPC
2020-04-01 11:11:17 +00:00