Commit Graph

194579 Commits

Author SHA1 Message Date
Wei Mi b49eac71ad Recommit [SampleFDO] Add flag for partial profile.
Fix the error of show-prof-info.test on some platforms without zlib.

The common profile usage is to collect profile from a target and then use the profile to guide the optimized build for the same target. There are some cases that no profile can be collected for a target. In those cases, although no full profile is available, it is possible to have some partial profile collected from other targets to optimize common libraries and utilities. A flag is needed to tell the partial profile from the full profile apart, so compiler can use different strategy for them.

Differential Revision: https://reviews.llvm.org/D77426
2020-04-07 14:28:25 -07:00
Stanislav Mekhanoshin 96e51ed005 [AMDGPU] Implement copyPhysReg for 16 bit subregs
Differential Revision: https://reviews.llvm.org/D74937
2020-04-07 14:22:46 -07:00
Matt Arsenault 2481f26ac3 CodeGen: Use Register in TargetFrameLowering 2020-04-07 17:07:44 -04:00
Nikita Popov fe8abbf442 [BPI] Clear handles when releasing memory (NFC)
This reduces max-rss of sqlite compilation by 2.5%.
2020-04-07 22:51:01 +02:00
David Blaikie da4ffc64e4 Remove some top-level const from return values seen in review 2020-04-07 13:22:22 -07:00
George Burgess IV ff30d01522 [TLI] fix a function's (commented) signature; NFC
__strlen_chk returns a `size_t`, not a `char *`.
2020-04-07 13:04:54 -07:00
Matt Arsenault aa26dd9858 CodeGen: Use Register in more places 2020-04-07 15:59:40 -04:00
Wei Mi c5da949ae8 Revert "[SampleFDO] Add flag for partial profile." show-prof-info.test breaks on some platforms.
This reverts commit e3ba652a14.
2020-04-07 12:54:51 -07:00
Wei Mi e3ba652a14 [SampleFDO] Add flag for partial profile.
The common profile usage is to collect profile from a target and then use the profile to guide the optimized build for the same target. There are some cases that no profile can be collected for a target. In those cases, although no full profile is available, it is possible to have some partial profile collected from other targets to optimize common libraries and utilities. A flag is needed to tell the partial profile from the full profile apart, so compiler can use different strategy for them.

Differential Revision: https://reviews.llvm.org/D77426
2020-04-07 12:17:56 -07:00
Nemanja Ivanovic ecd8435483 [NFC][PowerPC] Fix register class for patterns using XXPERMDIs
There are a few patterns where we use a superclass for inputs to this
instruction rather than the correct class. This can sometimes lead to
unncessary copies.
2020-04-07 14:06:08 -05:00
Graham Sellers a19a56f6a1 [AMDGPU] Extend constant folding for logical operations
This patch extends existing constant folding in logical operations to
handle S_XNOR, S_NAND, S_NOR, S_ANDN2, S_ORN2, V_LSHL_ADD_U32 and
V_AND_OR_B32. Also added a couple of tests for existing folds.
2020-04-07 14:37:16 -04:00
Craig Topper c41685b16f [SelectionDAG] Make getZeroExtendInReg take a vector VT if the operand VT is a vector.
This removes a call to getScalarType from a bunch of call sites.
It also makes the behavior consistent with SIGN_EXTEND_INREG.

Differential Revision: https://reviews.llvm.org/D77631
2020-04-07 11:34:08 -07:00
LLVM GN Syncbot 1a28d33f37 [gn build] Port 88c2137b6d 2020-04-07 18:26:53 +00:00
Alexey Lapshin 88c2137b6d [DWARFLinker][dsymutil][NFC] Move DwarfStreamer into DWARFLinker.
For implementing "remove obsolete debug info in lld", it is neccesary
to have DWARF generation code implementation. dsymutil uses DwarfStreamer
for that purpose. DwarfStreamer uses AsmPrinter. It is considered OK
to use AsmPrinter based code in lld(D74169). This patch moves
DwarfStreamer implementation into DWARFLinker, so that it could be reused
from lld.

Generally, a better place for such a common DWARF generation code would be
not DWARFLinker but an additional separate library. Such a library could
contain a single version of DWARF generation routines and could also
be independent of AsmPrinter. At the current moment, DwarfStreamer
does not pretend to be such a general implementation of DWARF generation.
So I decided to put it into DWARFLinker since it is the only user
of DwarfStreamer.

Testing: it passes "check-all" lit testing. MD5 checksum for clang .dSYM
bundle matches for the dsymutil with/without that patch.

Reviewed By: JDevlieghere

Differential revision: https://reviews.llvm.org/D77169
2020-04-07 21:21:54 +03:00
Matt Arsenault f524194ffd AMDGPU: Cleanup test MIR 2020-04-07 14:09:25 -04:00
Eli Friedman e9ac757f79 [AArch64] Don't expand memcmp in strict align mode.
7aecf232 fixed the bug where we would miscompile, but we still generate
a crazy amount of code. Turn off the expansion until someone implements
an appropriate heuristic.

Differential Revision: https://reviews.llvm.org/D77599
2020-04-07 10:53:36 -07:00
Matt Arsenault f596ab4066 AMDGPU: Use early return 2020-04-07 13:48:00 -04:00
Julian Lettner eb5ca295d7 [lit] Cleanup printing of discovered suites and tests 2020-04-07 10:39:35 -07:00
Sam Clegg 5be42f36f5 [WebAssembly][MC] Fix leak of std::string members in MCSymbolWasm
Summary: Fixes: https://bugs.llvm.org/show_bug.cgi?id=45452

Subscribers: dschuff, jgravelle-google, hiraditya, aheejin, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77627
2020-04-07 10:38:43 -07:00
Stanislav Mekhanoshin 12a324393d [AMDGPU] Limit endcf-collapase to simple if
We can only collapse adjacent SI_END_CF if outer statement
belongs to a simple SI_IF, otherwise correct mask is not in the
register we expect, but is an argument of an S_XOR instruction.

Even if SI_IF is simple it might be lowered using S_XOR because
lowering is dependent on a basic block layout. It is not
considered simple if instruction consuming its output is
not an SI_END_CF. Since that SI_END_CF might have already been
lowered to an S_OR isSimpleIf() check may return false.

This situation is an opportunity for a further optimization
of SI_IF lowering, but that is a separate optimization. In the
meanwhile move SI_END_CF post the lowering when we already know
how the rest of the CFG was lowered since a non-simple SI_IF
case still needs to be handled.

Differential Revision: https://reviews.llvm.org/D77610
2020-04-07 10:27:23 -07:00
Aaron Ballman 95eb50c447 Check LLVM_BUILD_LLVM_C_DYLIB before building the C DLL with MSVC. 2020-04-07 13:13:58 -04:00
Simon Pilgrim 6f46e9af8a [X86][SSE] Add PTEST(AND(X,Y),AND(X,Y)) tests derived from PR42035 examples 2020-04-07 17:58:54 +01:00
Matt Arsenault b281138a1b DAG: Use the correct getPointerTy in a few places
These should not be assuming address space 0. Calling getPointerTy is
generally the wrong thing to do, since you should already know the
type from the incoming IR.
2020-04-07 12:45:41 -04:00
Nikita Popov 259649a519 [RDA] Avoid full reprocessing of blocks in loops (NFCI)
RDA sometimes needs to visit blocks twice, to take into account
reaching defs coming in along loop back edges. Currently it handles
repeated visitation the same way as usual, which means that it will
scan through all instructions and their reg unit defs again. Not
only is this very inefficient, it also means that all reaching defs
in loops are going to be inserted twice.

We can do much better than this. The only thing we need to handle
is a new reaching def from a predecessor, which either needs to be
prepended to the reaching definitions (if there was no reaching def
from a predecessor), or needs to replace an existing predecessor
reaching def, if it is more recent. Since D77508 we only store the
most recent predecessor reaching def, so that's the only one that
may need updating.

This also has the nice side-effect that reaching definitions are
now automatically sorted and unique, so drop the llvm::sort() call
in favor of an assertion.

Differential Revision: https://reviews.llvm.org/D77511
2020-04-07 17:55:37 +02:00
Nikita Popov 76e987b372 [RDA] Don't pass down TraversedMBB (NFC)
Only pass the MachineBasicBlock itself down to helper methods,
they don't need to know about traversal. Move the debug print
into the main method.
2020-04-07 17:53:04 +02:00
Nikita Popov 361c29d7ba [RDA] Avoid inserting duplicate reaching defs (NFCI)
An instruction may define the same reg unit multiple times,
avoid inserting the same reaching def multiple times in that case.

Also print the reg unit, rather than the super-register, in the
debug code.
2020-04-07 17:50:38 +02:00
David Tenty b9245f14b7 [NFC][PowerPC] Cleanup 64-bit and Darwin CalleeSavedRegs
Summary:
- Remove the no longer used Darwin CalleeSavedRegs
- Combine the SVR464 callee saved regs and AIX64 since the two are (and should be) identical into PPC64
- Update tests for 64-bit CSR change

Reviewers: sfertile, ZarkoCA, cebowleratibm, jasonliu, #powerpc

Reviewed By: sfertile

Subscribers: wuzish, nemanjai, hiraditya, kbarton, shchenz, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77235
2020-04-07 11:49:10 -04:00
diggerlin 3aa084947e [NFC][XCOFF] refactor readobj/XCOFFDumper.cpp
SUMMARY:

refactor readobj/XCOFFDumper.cpp with helper function getAlignmentLog2() , getSymbolType(), isLabel().

Reviewers: Hubert Tong, James Henderson
Subscribers: rupprecht, seiyai,hiradityu

Differential Revision: https://reviews.llvm.org/D77562
2020-04-07 11:33:31 -04:00
Simon Pilgrim e3b6059776 [X86][SSE] combineX86ShufflesConstants - early out for zeroable vectors (PR45443)
Shuffle combining can insert zero byte sized elements into the shuffle mask, which combineX86ShufflesConstants will attempt to fold without taking into account whether the byte-sized type is legal (e.g. AVX512F only targets).

If we have a full-zeroable vector then we should just return a zero version of the root type, otherwise if the type isn't valid we should bail.

Fixes PR45443
2020-04-07 14:45:29 +01:00
Nico Weber 448b777b86 Stop passing site cfg files via --param to llvm-lit.
This has been unnecessary since https://reviews.llvm.org/D37756.

https://reviews.llvm.org/D37838 removed it for llvm.

This removes it from clang, lld, clang-tools-extra (and the GN build).

No intended behavior change.

Differential Revision: https://reviews.llvm.org/D77585
2020-04-07 08:20:40 -04:00
Georgii Rymar 7fc599ceb0 [llvm-readobj] - Introduce warnings for cases when unable to read strings from string tables.
Currently we have no dedicated warnings, but we return error message instead of a result.
It is generally not consistent with another warnings we have.

This change was suggested and discussed here:
https://reviews.llvm.org/D77216#1954873

This change refines error messages we report and also I had to update the API
to implement it.

Differential revision: https://reviews.llvm.org/D77399
2020-04-07 14:40:32 +03:00
Sanjay Patel e268ec8e0d [InstCombine] add icmp+cast tests for ppc_fp128; NFC
See post-commit comments for rG0f56bbc.
2020-04-07 07:35:01 -04:00
Keith Walker 01dc10774e [ARM] unwinding .pad instructions missing in execute-only prologue
If the stack pointer is altered for local variables and we are generating
Thumb2 execute-only code the .pad directive is missing.

Usually the size of the adjustment is stored in a PC-relative location
and loaded into a register which is then added to the stack pointer.
However when we are generating execute-only code code the size of the
adjustment is instead generated using the MOVW/MOVT instruction pair.

As a by product of handling the execute-only case this also fixes an
existing issue that in the none execute-only case the .pad directive was
generated against the load of the constant to a register instruction,
instead of the instruction which adds the register to the stack pointer.

Differential Revision: https://reviews.llvm.org/D76849
2020-04-07 11:51:59 +01:00
Florian Hahn 6aabb109be [SCCP] Use ranges for predicate info conditions.
This patch updates the code that deals with conditions from predicate
info to make use of constant ranges.

For ssa_copy instructions inserted by PredicateInfo, we have 2 ranges:
1. The range of the original value.
2. The range imposed by the linked condition.

1. is known, 2. can be determined using makeAllowedICmpRegion. The
intersection of those ranges is the range for the copy.

With this patch, we get a nice increase in the number of instructions
eliminated by both SCCP and IPSCCP for some benchmarks:

For MultiSource, SPEC2000 & SPEC2006:

Tests: 237
Same hash: 170 (filtered out)
Remaining: 67
Metric: sccp.NumInstRemoved
Program                                        base    patch   diff
 test-suite...Source/Benchmarks/sim/sim.test    10.00   71.00  610.0%
 test-suite...CFP2000/177.mesa/177.mesa.test   361.00  1626.00 350.4%
 test-suite...encode/alacconvert-encode.test   141.00  602.00  327.0%
 test-suite...decode/alacconvert-decode.test   141.00  602.00  327.0%
 test-suite...CI_Purple/SMG2000/smg2000.test   1639.00 4093.00 149.7%
 test-suite...peg2/mpeg2dec/mpeg2decode.test    75.00  163.00  117.3%
 test-suite...T2006/401.bzip2/401.bzip2.test   358.00  513.00  43.3%
 test-suite...rks/FreeBench/pifft/pifft.test    11.00   15.00  36.4%
 test-suite...langs-C/unix-tbl/unix-tbl.test     4.00    5.00  25.0%
 test-suite...lications/sqlite3/sqlite3.test   541.00  667.00  23.3%
 test-suite.../CINT2000/254.gap/254.gap.test   243.00  299.00  23.0%
 test-suite...ks/Prolangs-C/agrep/agrep.test    25.00   29.00  16.0%
 test-suite...marks/7zip/7zip-benchmark.test   1135.00 1304.00 14.9%
 test-suite...lications/ClamAV/clamscan.test   1105.00 1268.00 14.8%
 test-suite...urce/Applications/lua/lua.test   398.00  436.00   9.5%

Metric: sccp.IPNumInstRemoved
Program                                        base   patch   diff
 test-suite...C/CFP2000/179.art/179.art.test     1.00   3.00  200.0%
 test-suite...006/447.dealII/447.dealII.test   429.00 1056.00 146.2%
 test-suite...nch/fourinarow/fourinarow.test     3.00   7.00  133.3%
 test-suite...CI_Purple/SMG2000/smg2000.test   818.00 1748.00 113.7%
 test-suite...ks/McCat/04-bisect/bisect.test     3.00   5.00  66.7%
 test-suite...CFP2000/177.mesa/177.mesa.test   165.00 255.00  54.5%
 test-suite...ediabench/gsm/toast/toast.test    18.00  27.00  50.0%
 test-suite...telecomm-gsm/telecomm-gsm.test    18.00  27.00  50.0%
 test-suite...ks/Prolangs-C/agrep/agrep.test    24.00  35.00  45.8%
 test-suite...TimberWolfMC/timberwolfmc.test    43.00  62.00  44.2%
 test-suite...encode/alacconvert-encode.test    46.00  66.00  43.5%
 test-suite...decode/alacconvert-decode.test    46.00  66.00  43.5%
 test-suite...langs-C/unix-tbl/unix-tbl.test    12.00  17.00  41.7%
 test-suite...peg2/mpeg2dec/mpeg2decode.test    31.00  41.00  32.3%
 test-suite.../CINT2000/254.gap/254.gap.test   117.00 154.00  31.6%

Reviewers: efriedma, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D76611
2020-04-07 11:09:18 +01:00
Djordje Todorovic 3a4d9f8335 [docs] Add the release notes about Debug Entry Values
Note that x86, arm and aarch64 targets support the Debug Entry Values
feature by default.

Differential Revision: https://reviews.llvm.org/D77494
2020-04-07 12:08:22 +02:00
Serguei Katkov b7e3759e17 [DAG] Consolidate require spill slot logic in lambda. NFC.
Move the logic whether lowering of deopt value requires a spill slot in
a separate lambda.

Reviewers: reames, dantrushin
Reviewed By: dantrushin
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D77629
2020-04-07 16:43:47 +07:00
Peter Smith 14c1e98754 [ARM] Remove condition that could never be true
From Arm v8 Architecture Reference Manual F5.1.84 LDREXD
The ldrexd instruction in Arm state has the following conditions:

t = UInt(Rt); t2 = t + 1; n = UInt(Rn);
if Rt<0> == '1' || t2 == 15 || n == 15 then UNPREDICTABLE;

In when Rt is odd or if Rt is 14 (making t2 15).

In the implementation when the pair is the UNPREDICTABLE R14_R15 we
would ideally return SOFT_FAIL. We can't because there is no R14_R15
value for us to return so we fail early returning FAIL.

The early return for registers outside the bounds of the table means
the check for Rt == 14 (0xE) redundant which causes a static analyzer
to flag the condition as never being true.

To fix the warning I've removed the check and replaced with a comment
explaining the difference with the specification.

Fixes pr41660

Differential Revision: https://reviews.llvm.org/D77463
2020-04-07 09:50:56 +01:00
Simon Tatham aab9e9de4d [Support,Windows] Tolerate failure of CryptGenRandom
Summary:
In `Unix/Process.inc`, we seed a random number generator from
`/dev/urandom` if possible, but if not, we're happy to fall back to
ordinary pseudorandom strategies, like the current time and PID.

The corresponding function on Windows calls `CryptGenRandom`, but it
//doesn't// have a fallback if that strategy fails. But `CryptGenRandom`
//can// fail, if a cryptography provider isn't properly initialized, or
occasionally (by our observation) simply intermittently.

If it's reasonable on Unix to implement traditional pseudorandom-number
seeding as a fallback, then it's surely reasonable to do the same on
Windows. So this patch adds a last-ditch use of ordinary rand(), using
much the same strategy as the Unix fallback code.

Reviewers: hans, sammccall

Reviewed By: hans

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77553
2020-04-07 09:18:12 +01:00
Pierre-vh 4fc59a468f Revert "[CodeGen][SelectionDAG] Flip Booleans More Often"
This reverts commit 23342bdcc8.
2020-04-07 09:09:10 +01:00
Pierre-vh 23342bdcc8 [CodeGen][SelectionDAG] Flip Booleans More Often
Differential Revision: https://reviews.llvm.org/D77201
2020-04-07 08:19:57 +01:00
Awanish Pandey 0d43e1688a [DWARF5]: Added a left over test case from D73462
Unfortunately this test case never made it to the trunk. This
was part of https://reviews.llvm.org/D73462 revision.
2020-04-07 10:32:56 +05:30
Sam Clegg f0bbf3d086 [WebAssembly] EmscriptenEHSjLj: Mark more functions as imported
These should have been part of https://reviews.llvm.org/D77192

Differential Revision: https://reviews.llvm.org/D77358
2020-04-06 21:27:31 -07:00
Julian Lettner 38edab1c40 [lit] Improve handling of parallelism group semaphores 2020-04-06 20:52:06 -07:00
Kai Luo 68ef0b6a49 [PowerPC] Pre-commit test case of float rounding in kernel build. NFC. 2020-04-07 02:07:58 +00:00
Xiang1 Zhang 01a32f2bd3 Enable IBT(Indirect Branch Tracking) in JIT with CET(Control-flow Enforcement Technology)
Do not commit the llvm/test/ExecutionEngine/MCJIT/cet-code-model-lager.ll because it will
cause build bot fail(not suitable for window 32 target).

Summary:
This patch comes from H.J.'s 2bd54ce7fa

**This patch fix the failed llvm unit tests which running on CET machine. **(e.g. ExecutionEngine/MCJIT/MCJITTests)

The reason we enable IBT at "JIT compiled with CET" is mainly that:  the JIT don't know the its caller program is CET enable or not.
If JIT's caller program is non-CET, it is no problem JIT generate CET code or not.
But if JIT's caller program is CET enabled,  JIT must generate CET code or it will cause Control protection exceptions.

I have test the patch at llvm-unit-test and llvm-test-suite at CET machine. It passed.
and H.J. also test it at building and running VNCserver(Virtual Network Console), it works too.
(if not apply this patch, VNCserver will crash at CET machine.)

Reviewers: hjl.tools, craig.topper, LuoYuanke, annita.zhang, pengfei

Reviewed By: LuoYuanke

Subscribers: tstellar, efriedma, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76900
2020-04-07 09:48:47 +08:00
Jun Ma 46bff786bc [Coroutines] Remove alignment check in shouldBeMustTail
Differential Revision: https://reviews.llvm.org/D77362
2020-04-07 09:07:34 +08:00
Eli Friedman 3f13ee8a00 [NFC] Modernize misc. uses of Align/MaybeAlign APIs.
Use the current getAlign() APIs where it makes sense, and use Align
instead of MaybeAlign when we know the value is non-zero.
2020-04-06 17:53:04 -07:00
Nico Weber e613f0ee8d Reland "Make llvm_source_root in llvm-lit relative too."
This reverts commit 3185881d69
and adds a missing "include(AddLLVM)" (similar lines already
exist elsewhere in compiler-rt).
2020-04-06 20:49:10 -04:00
Eli Friedman 68b03aee1a Remove SequentialType from the type heirarchy.
Now that we have scalable vectors, there's a distinction that isn't
getting captured in the original SequentialType: some vectors don't have
a known element count, so counting the number of elements doesn't make
sense.

In some cases, there's a better way to express the commonality using
other methods. If we're dealing with GEPs, there's GEP methods; if we're
dealing with a ConstantDataSequential, we can query its element type
directly.

In the relatively few remaining cases, I just decided to write out
the type checks. We're talking about relatively few places, and I think
the abstraction doesn't really carry its weight. (See thread "[RFC]
Refactor class hierarchy of VectorType in the IR" on llvmdev.)

Differential Revision: https://reviews.llvm.org/D75661
2020-04-06 17:03:49 -07:00
Davide Italiano 8115e08b05 [MachineCSE] Don't carry the wrong location when hoisting
PR: 45425
<rdar://problem/61359768>

Differential Revision:  https://reviews.llvm.org/D77604
2020-04-06 16:36:22 -07:00