Commit Graph

377131 Commits

Author SHA1 Message Date
Kazu Hirata 125ea20d55 [llvm] Use llvm::stable_sort (NFC) 2021-01-13 19:14:43 -08:00
Kazu Hirata 5c1c39e8d8 [llvm] Use *Set::contains (NFC) 2021-01-13 19:14:41 -08:00
Duncan P. N. Exon Smith 56d1ffb927 Revert "ADT: Fix reference invalidation in SmallVector::push_back and single-element insert"
This reverts commit 9abac60309 since there
are some bot errors on Windows:
http://lab.llvm.org:8011/#/builders/127/builds/4489

```
FAILED: lib/Support/CMakeFiles/LLVMSupport.dir/IntervalMap.cpp.obj
C:\PROGRA~2\MIB055~1\2017\PROFES~1\VC\Tools\MSVC\1416~1.270\bin\Hostx64\x64\cl.exe  /nologo /TP -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Ilib\Support -IC:\b\slave\sanitizer-windows\llvm-project\llvm\lib\Support -Iinclude -IC:\b\slave\sanitizer-windows\llvm-project\llvm\include /DWIN32 /D_WINDOWS   /Zc:inline /Zc:__cplusplus /Zi /Zc:strictStrings /Oi /Zc:rvalueCast /bigobj /W4 -wd4141 -wd4146 -wd4244 -wd4267 -wd4291 -wd4351 -wd4456 -wd4457 -wd4458 -wd4459 -wd4503 -wd4624 -wd4722 -wd4100 -wd4127 -wd4512 -wd4505 -wd4610 -wd4510 -wd4702 -wd4245 -wd4706 -wd4310 -wd4701 -wd4703 -wd4389 -wd4611 -wd4805 -wd4204 -wd4577 -wd4091 -wd4592 -wd4319 -wd4709 -wd4324 -w14062 -we4238 /Gw /MD /O2 /Ob2 -UNDEBUG -std:c++14  /EHs-c- /GR- /showIncludes /Folib\Support\CMakeFiles\LLVMSupport.dir\IntervalMap.cpp.obj /Fdlib\Support\CMakeFiles\LLVMSupport.dir\LLVMSupport.pdb /FS -c C:\b\slave\sanitizer-windows\llvm-project\llvm\lib\Support\IntervalMap.cpp
C:\b\slave\sanitizer-windows\llvm-project\llvm\include\llvm/ADT/SmallVector.h(746): error C2672: 'llvm::SmallVectorImpl<T>::insert_one_maybe_copy': no matching overloaded function found
        with
        [
            T=llvm::IntervalMapImpl::Path::Entry
        ]
C:\b\slave\sanitizer-windows\llvm-project\llvm\include\llvm/ADT/SmallVector.h(745): note: while compiling class template member function 'llvm::IntervalMapImpl::Path::Entry *llvm::SmallVectorImpl<T>::insert(llvm::IntervalMapImpl::Path::Entry *,T &&)'
        with
        [
            T=llvm::IntervalMapImpl::Path::Entry
        ]
C:\b\slave\sanitizer-windows\llvm-project\llvm\lib\Support\IntervalMap.cpp(22): note: see reference to function template instantiation 'llvm::IntervalMapImpl::Path::Entry *llvm::SmallVectorImpl<T>::insert(llvm::IntervalMapImpl::Path::Entry *,T &&)' being compiled
        with
        [
            T=llvm::IntervalMapImpl::Path::Entry
        ]
C:\b\slave\sanitizer-windows\llvm-project\llvm\include\llvm/ADT/SmallVector.h(1136): note: see reference to class template instantiation 'llvm::SmallVectorImpl<T>' being compiled
        with
        [
            T=llvm::IntervalMapImpl::Path::Entry
        ]
C:\b\slave\sanitizer-windows\llvm-project\llvm\include\llvm/ADT/IntervalMap.h(790): note: see reference to class template instantiation 'llvm::SmallVector<llvm::IntervalMapImpl::Path::Entry,4>' being compiled
C:\b\slave\sanitizer-windows\llvm-project\llvm\include\llvm/ADT/SmallVector.h(746): error C2783: 'llvm::IntervalMapImpl::Path::Entry *llvm::SmallVectorImpl<T>::insert_one_maybe_copy(llvm::IntervalMapImpl::Path::Entry *,ArgType &&)': could not deduce template argument for '__formal'
        with
        [
            T=llvm::IntervalMapImpl::Path::Entry
        ]
C:\b\slave\sanitizer-windows\llvm-project\llvm\include\llvm/ADT/SmallVector.h(727): note: see declaration of 'llvm::SmallVectorImpl<T>::insert_one_maybe_copy'
        with
        [
            T=llvm::IntervalMapImpl::Path::Entry
        ]
```
2021-01-13 19:04:20 -08:00
Arthur Eubanks b196dc6607 [NFC] Remove unused entry in PassRegistry.def 2021-01-13 19:01:07 -08:00
Duncan P. N. Exon Smith 9abac60309 ADT: Fix reference invalidation in SmallVector::push_back and single-element insert
For small enough, trivially copyable `T`, take the argument by value in
`SmallVector::push_back` and copy it when forwarding to
`SmallVector::insert_one_impl`. Otherwise, when growing, update the
argument appropriately.

Differential Revision: https://reviews.llvm.org/D93779
2021-01-13 18:58:24 -08:00
Alexandre Ganea eec856848c Revert "[Support] On Windows, take the affinity mask into account"
This reverts commit 336ab2d51d.
2021-01-13 21:34:54 -05:00
Esme-Yi ff40fb07ad [PowerPC] Try to fold sqrt/sdiv test results with the branch.
Summary: The patch tries to fold sqrt/sdiv test node, i.g FTSQRT, XVTDIVDP, and the branch, i.e br_cc if they meet these patterns:
(br_cc seteq, (truncateToi1 SWTestOp), 0) -> (BCC PRED_NU, SWTestOp)
(br_cc seteq, (and SWTestOp, 2), 0) -> (BCC PRED_NE, SWTestOp)
(br_cc seteq, (and SWTestOp, 4), 0) -> (BCC PRED_LE, SWTestOp)
(br_cc seteq, (and SWTestOp, 8), 0) -> (BCC PRED_GE, SWTestOp)
(br_cc setne, (truncateToi1 SWTestOp), 0) -> (BCC PRED_UN, SWTestOp)
(br_cc setne, (and SWTestOp, 2), 0) -> (BCC PRED_EQ, SWTestOp)
(br_cc setne, (and SWTestOp, 4), 0) -> (BCC PRED_GT, SWTestOp)
(br_cc setne, (and SWTestOp, 8), 0) -> (BCC PRED_LT, SWTestOp)

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D94054
2021-01-14 02:15:19 +00:00
Alexandre Ganea 336ab2d51d [Support] On Windows, take the affinity mask into account
The number of hardware threads available to a ThreadPool can be limited if setting an affinity mask.
For example:

> start /B /AFFINITY 0xF lld-link.exe ...

Would let LLD only use 4 hyper-threads.

Previously, there was an outstanding issue on Windows Server 2019 on dual-CPU machines, which was preventing from using both CPU sockets. In normal conditions, when no affinity mask was set, ProcessorGroup::AllThreads was different from ProcessorGroup::UsableThreads. The previous code in llvm/lib/Support/Windows/Threading.inc L201 was improperly assuming those two values to be equal, and consequently was limiting the execution to only one CPU socket.

Differential Revision: https://reviews.llvm.org/D92419
2021-01-13 21:00:09 -05:00
Richard Smith cd4c55c974 Fix grammar in diagnostic for wrong arity in a structured binding. 2021-01-13 17:41:09 -08:00
Craig Topper dfc1901d51 [RISCV] Custom lower ISD::VSCALE.
This patch custom lowers ISD::VSCALE into a csrr vlenb followed
by a shift right by 3 followed by a multiply by the scale amount.

I've added computeKnownBits support to indicate that the csrr vlenb
always produces 3 trailng bits of 0s so the shift right is "exact".
This allows the shift and multiply sequence to be nicely optimized
into a single shift or removed completely when the scale amount is
a power of 2.

The non power of 2 case multiplying by 24 is still producing
suboptimal code. We could remove the right shift and use a
multiply by 3. Hopefully we can improve DAG combine to fix that
since it's not unique to this sequence.

This replaces D94144.

Reviewed By: HsiangKai

Differential Revision: https://reviews.llvm.org/D94249
2021-01-13 17:14:49 -08:00
Peter Steinfeld 3de92ca78c [flang] Add tests for procedure arguments with implicit interfaces
It's possible to declare an external procedure and then pass it as an
actual argument to a subprogram expecting a procedure argument.  I added
tests for this and added an error message to distinguish passing an
actual argument with an implicit interface from passing an argument with
a mismatched explicit interface.

Differential Revision: https://reviews.llvm.org/D94505
2021-01-13 16:43:09 -08:00
Ryan Prichard c82deed676 [libunwind] Unwind through aarch64/Linux sigreturn frame
An AArch64 sigreturn trampoline frame can't currently be described
in a DWARF .eh_frame section, because the AArch64 DWARF spec currently
doesn't define a constant for the PC register. (PC and LR may need to
be restored to different values.)

Instead, use the same technique as libgcc or github.com/libunwind and
detect the sigreturn frame by looking for the sigreturn instructions:

    mov x8, #0x8b
    svc #0x0

If a sigreturn frame is detected, libunwind restores all the GPRs by
assuming that sp points at an rt_sigframe Linux kernel struct. This
behavior is a fallback mode that is only used if there is no ordinary
unwind info for sigreturn.

If libunwind can't find unwind info for a PC, it assumes that the PC is
readable, and would crash if it isn't. This could happen if:
 - The PC points at a function compiled without unwind info, and which
   is part of an execute-only mapping (e.g. using -Wl,--execute-only).
 - The PC is invalid and happens to point to unreadable or unmapped
   memory.

In the tests, ignore a failed dladdr call so that the tests can run on
user-mode qemu for AArch64, which uses a stack-allocated trampoline
instead of a vDSO.

Reviewed By: danielkiss, compnerd, #libunwind

Differential Revision: https://reviews.llvm.org/D90898
2021-01-13 16:38:36 -08:00
Jonas Paulsson ddd03842c3 [SystemZ] Clear Available set in SystemZPostRASchedStrategy::initialize().
This needs to be done in order to not crash with -misched-cutoff.

Fixes https://bugs.llvm.org/show_bug.cgi?id=45928

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D94383
2021-01-13 18:18:27 -06:00
Wei Mi 86341247c4 [NFC] Rename ThinLTOPhase to ThinOrFullLTOPhase and move it from PassBuilder.h
to Pass.h.

In some compiler passes like SampleProfileLoaderPass, we want to know which
LTO/ThinLTO phase the pass is in. Currently the phase is represented in enum
class PassBuilder::ThinLTOPhase, so it is only available in PassBuilder and
it also cannot represent phase in full LTO. The patch extends it to include
full LTO phases and move it from PassBuilder.h to Pass.h, then it is much
easier for PassBuilder to communiate with each pass about current LTO phase.

Differential Revision: https://reviews.llvm.org/D94613
2021-01-13 15:55:40 -08:00
James Player 854f0984f0 Fix llvm::Optional build breaks in MSVC using std::is_trivially_copyable
Current code breaks this version of MSVC due to a mismatch between `std::is_trivially_copyable` and `llvm::is_trivially_copyable` for `std::pair` instantiations.  Hence I was attempting to use `std::is_trivially_copyable` to set `llvm::is_trivially_copyable<T>::value`.

I spent some time root causing an `llvm::Optional` build error on MSVC 16.8.3 related to the change described above:

```
62>C:\src\ocg_llvm\llvm-project\llvm\include\llvm/ADT/BreadthFirstIterator.h(96,12): error C2280: 'llvm::Optional<std::pair<std::pair<unsigned int,llvm::Graph<4>::NodeSubset> *,llvm::Optional<llvm::Graph<4>::ChildIterator>>> &llvm::Optional<std::pair<std::pair<unsigned int,llvm::Graph<4>::NodeSubset> *,llvm::Optional<llvm::Graph<4>::ChildIterator>>>::operator =(const llvm::Optional<std::pair<std::pair<unsigned int,llvm::Graph<4>::NodeSubset> *,llvm::Optional<llvm::Graph<4>::ChildIterator>>> &)': attempting to reference a deleted function (compiling source file C:\src\ocg_llvm\llvm-project\llvm\unittests\ADT\BreadthFirstIteratorTest.cpp)
...
```
The "trivial" specialization of `optional_detail::OptionalStorage` assumes that the value type is trivially copy constructible and trivially copy assignable. The specialization is invoked based on a check of `is_trivially_copyable` alone, which does not imply both `is_trivially_copy_assignable` and `is_trivially_copy_constructible` are true.

[[ https://en.cppreference.com/w/cpp/named_req/TriviallyCopyable | According to the spec ]], a deleted assignment operator does not make `is_trivially_copyable` false. So I think all these properties need to be checked explicitly in order to specialize `OptionalStorage` to the "trivial" version:
```
/// Storage for any type.
template <typename T, bool = std::is_trivially_copy_constructible<T>::value
                          && std::is_trivially_copy_assignable<T>::value>
class OptionalStorage {
```
Above fixed my build break in MSVC, but I think we need to explicitly check `is_trivially_copy_constructible` too since it might be possible the copy constructor is deleted.  Also would be ideal to move over to `std::is_trivially_copyable` instead of the `llvm` namespace verson.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D93510
2021-01-13 15:23:48 -08:00
Craig Topper 7ec8f43659 [SPARC] Fix fp128 load/stores
The generated code for the split fp128 load/stores was missing a small yet important adjustment to the pointer metadata being fed into `getStore` and `getLoad`, making it out of sync with the effective memory address.
This problem often resulted in instructions being scheduled in the wrong order.

I also took this chance to clean up some "wrong" uses of `getAlignment` as done in D77687.

Thanks @jrtc27 for finding the problem and providing a patch.

Patch by LemonBoy and Jessica Clarke(jrtc27)

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D94345
2021-01-13 14:59:50 -08:00
Arthur Eubanks 39e6d24237 [NewPM] Only non-trivially loop unswitch at -O3 and for non-optsize functions
This matches the legacy pipeline/pass.

Reviewed By: asbirlea, SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D94559
2021-01-13 14:54:49 -08:00
Jian Cai 82c4153e66 Revert "[AsmParser] make .ascii support spaces as separators"
This reverts commit e0963ae274. The change
breaks some GDB tests. Revert it while we investigate.
2021-01-13 14:38:22 -08:00
wlei 35debdfcac [NFC] Fix build break by a initializer list converting error 2021-01-13 14:28:02 -08:00
Fangrui Song 74a42aedfe [test] Add Clang side tests for -fdebug-info-for-profiling
There is currently a driver test but no test for its effect on linkageName & pass pipeline.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D94381
2021-01-13 14:27:39 -08:00
Michael Jones ea8034ec35 [libc][NFC] change isblank and iscntrl from implicit casting
isblank and iscntrl were casting an int to a char implicitly and this
was throwing errors under Fuchsia. I've added a static cast to resolve
this issue.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D94634
2021-01-13 22:06:56 +00:00
Florian Hahn 6077d55381
[DSE] Add tests with stores of existing values.
This patch pre-commits test cases with dead stores of
existing values for D90328. It also updates a few tests that had such
stores by accident, to preserve the original spirit of those tests.
2021-01-13 21:56:21 +00:00
Nikita Popov f711cb9a8a [FuncAttrs] Add additional willreturn tests (NFC) 2021-01-13 22:33:22 +01:00
Kazu Hirata fb98a1be43 Fix the warnings on unused variables (NFC) 2021-01-13 13:32:40 -08:00
Michael Jones 4cfccd5133 [libc][NFC] add macro for fuchsia to switch test backend to zxtest
This moves utils/UnitTest/Test.[h/cpp] to LibcTest.[h/cpp] and adds a
new Test.h that acts as a switcher so that Fuchsia can use the zxtest
backend for running our tests as part of their build.

FuchsiaTest.h is for including fuchsia's zxtest library and anything
else needed to make the tests work under fuchsia (currently just
undefining the isascii macro for the test).

Downstream users, please fix your build instead of reverting.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D94625
2021-01-13 21:28:02 +00:00
Tim Keith 18278ff1aa [flang] Fix accessibility of USEd name in .mod file
If a module specifies default private accessibility, names that have
been use-associated are private by default. This was not reflected in
.mod files.

Differential Revision: https://reviews.llvm.org/D94602
2021-01-13 12:52:44 -08:00
Florian Hahn 01c3135850
[LTO] Add test for freestanding LTO option.
This patch adds a test for the -lto-freestanding option, similar to
llvm/test/ThinLTO/X86/tli-nobuiltin.ll.
2021-01-13 20:43:22 +00:00
Aart Bik f4f158b2f8 [mlir][sparse] add vectorization strategies to sparse compiler
Similar to the parallelization strategies, the vectorization strategies
provide control on what loops should be vectorize. Unlike the parallel
strategies, only innermost loops are considered, but including reductions,
with the control of vectorizing dense loops only or dense and sparse loops.

The vectorized loops are always controlled by a vector mask to avoid
overrunning the iterations, but subsequent vector operation folding removes
redundant masks and replaces the operations with more efficient counterparts.
Similarly, we will rely on subsequent loop optimizations to further optimize
masking, e.g. using an unconditional full vector loop and scalar cleanup loop.

The current strategy already demonstrates a nice interaction between the
sparse compiler and all prior optimizations that went into the vector dialect.

Ongoing discussion at:
https://llvm.discourse.group/t/mlir-support-for-sparse-tensors/2020/10

Reviewed By: penpornk

Differential Revision: https://reviews.llvm.org/D94551
2021-01-13 11:55:23 -08:00
Jeroen Dobbelaere bb72adcaee [NFC] Use correct ssa.copy spelling when referring to the intrinsic
Split out from D91250. Fixes wrong ssa_copy naming.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D94310
2021-01-13 20:43:14 +01:00
Alexandre Ganea e7a371f9fd [LLD][COFF] Avoid std::vector resizes during type merging
Consistently saves approx. 0.6 sec (out of 18 sec) on a large output (400 MB EXE, 2 GB PDB).

Differential Revision: https://reviews.llvm.org/D94555
2021-01-13 14:35:03 -05:00
Tres Popp 3bd620d450 [mlir] Correct 2 places that result in corrupted conversion rollbacks
This corrects the last 2 issues caught by tests when causing dialect
conversion rollbacks to occur.

Differential Revision: https://reviews.llvm.org/D94623
2021-01-13 20:31:15 +01:00
wlei 33a8466531 [NFC] fix missing SectionName declaration 2021-01-13 11:30:09 -08:00
wlei c681400b25 [CSSPGO][llvm-profgen] Virtual unwinding with pseudo probe
This change extends virtual unwinder to support pseudo probe in llvm-profgen. Please refer https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s and https://reviews.llvm.org/D89707 for more context about CSSPGO and llvm-profgen.

**Implementation**

- Added `ProbeBasedCtxKey` derived from `ContextKey` for sample counter aggregation. As we need string splitting to infer the profile for callee function, string based context introduces more string handling overhead, here we just use probe pointer based context.
- For linear unwinding, as inline context is encoded in each pseudo probe, we don't need to go through each instruction to extract range sharing same inliner. So just record the range for the context.
- For probe based context, we should ignore the top frame probe since it will be extracted from the address range. we defer the extraction in `ProfileGeneration`.
- Added `PseudoProbeProfileGenerator` for pseudo probe based profile generation.
- Some helper function to get pseduo probe info(call probe, inline context) from profiled binary.
- Added regression test for unwinder's output

The pseudo probe based profile generation will be in the upcoming patch.

Test Plan:

ninja & ninja check-llvm

Differential Revision: https://reviews.llvm.org/D92896
2021-01-13 11:02:58 -08:00
wlei 414930b91b [CSSPGO][llvm-profgen] Refactor to unify hashable interface for trace sample and context-sensitive counter
As we plan to support both CSSPGO and AutoFDO for llvm-profgen, we will have different kinds of perf sample and different kinds of sample counter(cs/non-cs, with/without pseudo probe) which both need to do aggregation in hash map.  This change implements the hashable interface(`Hashable`) and the unified base class for them to have better extensibility and reusability.

Currently perf trace sample and sample counter with context implemented this `Hashable` and  the class hierarchy is like:

```
| Hashable
           | PerfSample
                          | HybridSample
                          | LBRSample
           | ContextKey
                          | StringBasedCtxKey
                          | ProbeBasedCtxKey
                          | CallsiteBasedCtxKey
           | ...
```

- Class specifying `Hashable` should implement `getHashCode` and `isEqual`. Here we make `getHashCode` a non-virtual function to avoid vtable overhead, so derived class should calculate and assign the base class's HashCode manually. This also provides the flexibility for calculating the hash code incrementally(like rolling hash) during frame stack unwinding
- `isEqual` is a virtual function, which will have perf overhead. In the future, if we redesign a better hash function, then we can just skip this or switch to non-virtual function.
- Added `PerfSample` and `ContextKey` as base class for perf sample and counter context key, leveraging llvm-style RTTI for this.
- Added `StringBasedCtxKey` class extending  `ContextKey` to use string as context id.
- Refactor `AggregationCounter` to take all kinds of `PerfSample` as key
- Refactor `ContextSampleCounter` to take all kinds of `ContextKey` as key
- Other refactoring work:
 - Create a wrapper class `SampleCounter` to wrap `RangeCounter` and `BranchCounter`
 - Hoist `ContextId` and `FunctionProfile` out of `populateFunctionBodySamples` and `populateFunctionBoundarySamples` to reuse them in ProfileGenerator

Differential Revision: https://reviews.llvm.org/D92584
2021-01-13 11:02:57 -08:00
wlei b3154d11bc [CSSPGO][llvm-profgen] Pseudo probe decoding and disassembling
This change implements pseudo probe decoding and disassembling for llvm-profgen/CSSPGO. Please see https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s and https://reviews.llvm.org/D89707 for more context about CSSPGO and llvm-profgen.

**ELF section format**
Please see the encoding patch(https://reviews.llvm.org/D91878) for more details of the format, just copy the example here:

Two section(`.pseudo_probe_desc` and  `.pseudoprobe` ) is emitted in ELF to support pseudo probe.
The format of `.pseudo_probe_desc` section looks like:

```
.section   .pseudo_probe_desc,"",@progbits
.quad   6309742469962978389  // Func GUID
.quad   4294967295           // Func Hash
.byte   9                    // Length of func name
.ascii  "_Z5funcAi"          // Func name
.quad   7102633082150537521
.quad   138828622701
.byte   12
.ascii  "_Z8funcLeafi"
.quad   446061515086924981
.quad   4294967295
.byte   9
.ascii  "_Z5funcBi"
.quad   -2016976694713209516
.quad   72617220756
.byte   7
.ascii  "_Z3fibi"
```

For each `.pseudoprobe` section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e, a function with a code entry in the `.text` section). A function record has the following format :

```
FUNCTION BODY (one for each outlined function present in the text section)
    GUID (uint64)
        GUID of the function
    NPROBES (ULEB128)
        Number of probes originating from this function.
    NUM_INLINED_FUNCTIONS (ULEB128)
        Number of callees inlined into this function, aka number of
        first-level inlinees
    PROBE RECORDS
        A list of NPROBES entries. Each entry contains:
          INDEX (ULEB128)
          TYPE (uint4)
            0 - block probe, 1 - indirect call, 2 - direct call
          ATTRIBUTE (uint3)
            reserved
          ADDRESS_TYPE (uint1)
            0 - code address, 1 - address delta
          CODE_ADDRESS (uint64 or ULEB128)
            code address or address delta, depending on ADDRESS_TYPE
    INLINED FUNCTION RECORDS
        A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined
        callees.  Each record contains:
          INLINE SITE
            GUID of the inlinee (uint64)
            ID of the callsite probe (ULEB128)
          FUNCTION BODY
            A FUNCTION BODY entry describing the inlined function.
```

**Disassembling**
A switch `--show-pseudo-probe` is added to use along with `--show-disassembly` to print disassembly code with pseudo probe directives.

For example:
```
00000000002011a0 <foo2>:
  2011a0: 50                    push   rax
  2011a1: 85 ff                 test   edi,edi
  [Probe]:  FUNC: foo2  Index: 1  Type: Block
  2011a3: 74 02                 je     2011a7 <foo2+0x7>
  [Probe]:  FUNC: foo2  Index: 3  Type: Block
  [Probe]:  FUNC: foo2  Index: 4  Type: Block
  [Probe]:  FUNC: foo   Index: 1  Type: Block  Inlined: @ foo2:6
  2011a5: 58                    pop    rax
  2011a6: c3                    ret
  [Probe]:  FUNC: foo2  Index: 2  Type: Block
  2011a7: bf 01 00 00 00        mov    edi,0x1
  [Probe]:  FUNC: foo2  Index: 5  Type: IndirectCall
  2011ac: ff d6                 call   rsi
  [Probe]:  FUNC: foo2  Index: 4  Type: Block
  2011ae: 58                    pop    rax
  2011af: c3                    ret
```

**Implementation**
- `PseudoProbeDecoder` is added in ProfiledBinary as an infra for the decoding. It decoded the two section and generate two map: `GUIDProbeFunctionMap` stores all the `PseudoProbeFunction` which is the abstraction of a general function. `AddressProbesMap` stores all the pseudo probe info indexed by its address.
- All the inline info is encoded into binary as a trie(`PseudoProbeInlineTree`) and will be constructed from the decoding. Each pseudo probe can get its inline context(`getInlineContext`) by traversing its inline tree node backwards.

Test Plan:
ninja & ninja check-llvm

Differential Revision: https://reviews.llvm.org/D92334
2021-01-13 11:02:57 -08:00
peter klausler 166e5c335c [flang] Do not create HostAssoc symbols in derived type scopes
When needed due to a specification expression in a derived type,
the host association symbols should be created in the surrounding
subprogram's scope instead.

Differential Revision: https://reviews.llvm.org/D94567
2021-01-13 11:01:27 -08:00
Sanjay Patel 123674a816 [SLP] simplify type check for reductions
This is NFC-intended. The 'valid' call allows int/FP/pointers
for other parts of SLP. The difference here is that we can't
reduce pointers.
2021-01-13 13:30:46 -05:00
Krzysztof Parzyszek a2e6506c47 [Hexagon] Improve legalizing of ISD::SETCC result 2021-01-13 12:29:22 -06:00
peter klausler a50bb84ec0 [flang] Fix classification of shape inquiries in specification exprs
In some contexts, including the motivating case of determining whether
the expressions that define the shape of a variable are "constant expressions"
in the sense of the Fortran standard, expression rewriting via Fold()
is not necessary, and should not be required.  The inquiry intrinsics LBOUND,
UBOUND, and SIZE work correctly now in specification expressions and are
classified correctly as being constant expressions (or not).  Getting this right
led to a fair amount of API clean-up as a consequence, including the
folding of shapes and TypeAndShape objects, and new APIs for shapes
that do not fold for those cases where folding isn't needed.  Further,
the symbol-testing predicate APIs in Evaluate/tools.h now all resolve any
associations of their symbols and work transparently on use-, host-, and
construct-association symbols; the tools used to resolve those associations have
been defined and documented more precisely, and their clients adjusted as needed.

Differential Revision: https://reviews.llvm.org/D94561
2021-01-13 10:05:14 -08:00
LLVM GN Syncbot 14f322f074 [gn build] Port 60fda8ebb6 2021-01-13 17:33:32 +00:00
Sam Tebbs 60fda8ebb6 [ARM] Add a pass that re-arranges blocks when there is a backwards WLS branch
Blocks can be laid out such that a t2WhileLoopStart branches backwards. This is forbidden by the architecture and so it fails to be converted into a low-overhead loop. This new pass checks for these cases and moves the target block, fixing any fall-through that would then be broken.

Differential Revision: https://reviews.llvm.org/D92385
2021-01-13 17:23:00 +00:00
Simon Pilgrim 993c488ed2 [DAG] visitVECTOR_SHUFFLE - use all_of to check for all-undef shuffle mask. NFCI. 2021-01-13 17:19:41 +00:00
Simon Pilgrim efb6e45d2b [X86][AVX] Add test for another 'reverse HADD' pattern mentioned in PR41813 2021-01-13 17:19:41 +00:00
Simon Pilgrim cbbfc82586 [X86][SSE] canonicalizeShuffleMaskWithHorizOp - simplify shuffle(HOP(HOP(X,Y),HOP(Z,W))) style chains.
See if we can remove the shuffle by resorting a HOP chain so that the HOP args are pre-shuffled.

This initial version just handles (the most common) v4i32/v4f32 hadd/hsub reduction patterns - future work can extend this to v8i16 types plus PACK chains (2f64 HADD/HSUB should already be handled in the half-lane combine code later on).
2021-01-13 17:19:40 +00:00
Jonas Devlieghere 48d2068fb7 [dsymutil] Warn on timestmap mismatch between object file and debug map
This re-lands e5553b9a6a with two small fixes to the tests:

 - Don't touch the source directory in debug-map-parsing.test but
   instead copy everything over in a temporary directory in
   timestamp-mismatch.test.
 - Don't redirect stderr to stdout to avoid the output getting
   intertwined in extern-alias.test.
2021-01-13 09:15:30 -08:00
Andrew Litteken 05b1a15f70 [IROutliner] Adapting to hoisted bitcasts in CodeExtractor
In commit 700d2417d8 the CodeExtractor
was updated so that bitcasts that have lifetime markers that beginning
outside of the region are deduplicated outside the region and are not
used as an output.  This caused a discrepancy in the IROutliner, where
in these cases there were arguments added to the aggregate function
that were not needed causing assertion errors.

The IROutliner queries the CodeExtractor twice to determine the inputs
and outputs, before and after `findAllocas` is called with the same
ValueSet for the outputs causing the duplication. This has been fixed
with a dummy ValueSet for the first call.

However, the additional bitcasts prevent us from using the same
similarity relationships that were previously defined by the
IR Similarity Analysis Pass. In these cases, we check whether the
initial version of the region being analyzed for outlining is still the
same as it was previously.  If it is not, i.e. because of the additional
bitcast instructions from the CodeExtractor, we discard the region.

Reviewers: yroux

Differential Revision: https://reviews.llvm.org/D94303
2021-01-13 11:10:37 -06:00
Sam McCall 0bbc6a6bb6 [clangd] Remove some old CodeCompletion options that are never (un)set. NFC 2021-01-13 18:01:48 +01:00
Utkarsh Saxena a4f3866882 [clangd] Remove "decision-forest-base" experimental flag.
The value of this flag can only be fine tuned by using A/B testing on large
user base.
We do not expect individual users to use and fine tune this flag.

Differential Revision: https://reviews.llvm.org/D94513
2021-01-13 17:54:38 +01:00
Nikita Popov 17863614da [InstCombine] Fold select -> and/or using impliesPoison
We can fold a ? b : false to a & b if is_poison(b) implies that
is_poison(a), at which point we're able to reuse all the usual fold
on ands. In particular, this covers the very common case of
icmp X, C && icmp X, C'. The same applies to ors.

This currently only has an effect if the
-instcombine-unsafe-select-transform=0 option is set.

Differential Revision: https://reviews.llvm.org/D94550
2021-01-13 17:45:40 +01:00
Sanjay Patel e433ca28ec [SLP] add reduction test for FMF; NFC 2021-01-13 11:43:51 -05:00