374198 Commits

Author SHA1 Message Date
Fangrui Song 9c53b2adc8 [MC] Delete unused declarations
Notes:

* llvm::createAsmStreamer: it has been moved to TargetRegistry.h
* (anon ns)::WasmObjectWriter::updateCustomSectionRelocations: remnant of D46335
* COFFAsmParser::ParseSEHRegisterNumber: remnant of D66625
* llvm::CodeViewContext::isValidCVFileNumber: accidentally added by r279847
2020-12-06 15:36:39 -08:00
Fangrui Song 4701cb41ed [lld] Delete unused declarations
Notes:

* runMSVCLinker: remnant of r338615
* wasm markSymbol: remnant of r374275
* wasm addDataAddressGlobal: accidentally added by r372779
* MachO Writer::createSymtabContents: accidentally added by D76839
2020-12-06 15:26:37 -08:00
Craig Topper 305fcc9122 [LoopIdiomRecognize] Merge a conditional operator with an earlier if and remove an extra temporary variable. NFC
The CountPrev variable was only used to forward a value from
the if statement to the conditional operator under the same
condition.

While there move some variable declarations to their first
assignment.
2020-12-06 15:23:18 -08:00
Fangrui Song 6785ca0124 [llvm-c] Delete unimplemented llvm-c/LinkTimeOptimizer.h
The file was added in 2007 but the functions have never been implemented.
Having the file can only cause confusion to existing C API (llvm-c/lto.h) users.
2020-12-06 15:18:25 -08:00
Fangrui Song 9fe1809f8c [X86] Delete 3 unused declarations 2020-12-06 15:13:39 -08:00
Fangrui Song 2d03c8e2c8 [CodeGen] Delete 4 unused declarations 2020-12-06 15:02:18 -08:00
Fangrui Song 0e0d616fa2 [CodeGen] Delete 15 unused declarations
Notes about a few declarations:

* LiveVariables::RegisterDefIsDead: deleted by r47927
* createForwardControlFlowIntegrityPass, createJumpInstrTablesPass: deleted by r230780
* RegScavenger::setLiveInsUsed: deleted by r292543
* ScheduleDAGInstrs::{toggleKillFlag,startBlockForKills}: deleted by r304055
* Localizer::shouldLocalize: remnant of D75207
* DwarfDebug::addSectionLabel: deleted by r373273
2020-12-06 14:55:04 -08:00
Fangrui Song a2f922140f [TableGen] Delete 11 unused declarations 2020-12-06 13:21:07 -08:00
Fangrui Song 2832f3528c [Transforms] Delete unused declarations from NewGVN/CoroSplit/ValueMapper 2020-12-06 13:04:01 -08:00
Florian Hahn f19876c536 [ConstraintElimination] Bail out if system gets too big.
For some inputs, the constraint system can grow quite large during
solving, because it replaces complex constraints with one or more
simpler constraints. This adds a cut-off to avoid compile-time explosion
on problematic inputs.
2020-12-06 20:19:15 +00:00
LLVM GN Syncbot d1c14dd0fc [gn build] Port 6b989a1710 2020-12-06 20:12:22 +00:00
Wenlei He 6b989a1710 [CSSPGO] Infrastructure for context-sensitive Sample PGO and Inlining
This change adds the context-sensitive sample PGO infrastructure described in the CSSPGO RFC (https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s). It introduces an abstraction between the input profile and the profile loader that queries the input profile for functions. Specifically, there's now the notion of base profile and context profile, and they are managed by the new SampleContextTracker for adjusting and merging profiles based on inline decisions. It works with the top-down profile-guided inliner in the profile loader (https://reviews.llvm.org/D70655) for better inlining with specialization and better post-inline profile fidelity. In the future, we can also expose this infrastructure to the CGSCC inliner in order for it to take advantage of context-sensitive profile. This change is the consumption part of context-sensitive profile (the generation part is in this stack: https://reviews.llvm.org/D89707). We've seen good results internally in conjunction with Pseudo-probe (https://reviews.llvm.org/D86193). Patches for integration with Pseudo-probe are coming up soon.

Currently the new infrastructure kicks in when the input profile contains the new context-sensitive profile; otherwise it's a no-op and does not affect existing AutoFDO.

**Interface**

There're two sets of interfaces for query and tracking respectively exposed from SampleContextTracker. For query, now instead of simply getting a profile from input for a function, we can explicitly query base profile or context profile for given call path of a function. For tracking, there're separate APIs for marking context profile as inlined, or promoting and merging not inlined context profile.

- Query base profile (`getBaseSamplesFor`)
Base profile is the merged synthetic profile for function's CFG profile from any outstanding (not inlined) context. We can query base profile by function.

- Query context profile (`getContextSamplesFor`)
Context profile is a function's CFG profile for a given calling context. We can query context profile by context string.

- Track inlined context profile (`markContextSamplesInlined`)
When a function is inlined for given calling context, we need to mark the context profile for that context as inlined. This is to make sure we don't include inlined context profile when synthesizing base profile for that inlined function.

- Track not-inlined context profile (`promoteMergeContextSamplesTree`)
When a function is not inlined for given calling context, we need to promote the context profile tree so the not inlined context becomes top-level context. This preserves the sub-context under that function so later inline decisions for that not inlined function will still have context profile for its call tree. Note that profiles will be merged if needed when promoting a context profile tree if any of the nodes already exists at its promoted destination.

**Implementation**

Implementation-wise, `SampleContext` is created as abstraction for context. Currently it's a string for call path, and we can later optimize it to something more efficient, e.g. context id. Each `SampleContext` also has a `ContextState` indicating whether it's raw context profile from input, whether it's inlined or merged, whether it's synthetic profile created by compiler. Each `FunctionSamples` now has a `SampleContext` that tells whether it's base profile or context profile, and for context profile what is the context and state.

On top of the above context representation, a custom trie tree is implemented to track and manage context profiles. Specifically, `SampleContextTracker` is implemented that encapsulates a trie tree with `ContextTrieNode` as node. Each node of the trie tree represents a frame in calling context, thus the path from root to a node represents a valid calling context. We also track `FunctionSamples` for each node, so this trie tree can serve efficient query for context profile. Accordingly, context profile tree promotion now becomes moving a subtree to be under the root of the entire tree, and merging nodes for the subtree if this move encounters existing nodes.
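
The trie and the promote-and-merge step described above can be sketched roughly as follows. This is a simplified illustration and not the code from the patch: `ToyTrieNode`, `ToyContextTracker`, and their methods are hypothetical names, a plain sample count stands in for `FunctionSamples`, and a context is passed as a vector of frame names rather than a context string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a context trie node: one node per call frame.
// A plain counter replaces the per-node FunctionSamples.
struct ToyTrieNode {
  uint64_t samples = 0;
  std::map<std::string, std::unique_ptr<ToyTrieNode>> children;
};

class ToyContextTracker {
public:
  // Record samples for a calling context such as {"main", "foo", "bar"}.
  void addContext(const std::vector<std::string> &path, uint64_t samples) {
    ToyTrieNode *node = &root;
    for (const auto &frame : path) {
      auto &child = node->children[frame];
      if (!child)
        child = std::make_unique<ToyTrieNode>();
      node = child.get();
    }
    node->samples += samples;
  }

  // Return the samples stored for an exact context, or 0 if it is absent.
  uint64_t samplesFor(const std::vector<std::string> &path) const {
    const ToyTrieNode *node = &root;
    for (const auto &frame : path) {
      auto it = node->children.find(frame);
      if (it == node->children.end())
        return 0;
      node = it->second.get();
    }
    return node->samples;
  }

  // Move the subtree rooted at `path` directly under the root, merging it
  // with any node that already exists at the destination.
  void promote(const std::vector<std::string> &path) {
    if (path.size() < 2)
      return; // empty or already top-level
    ToyTrieNode *parent = &root;
    for (std::size_t i = 0; i + 1 < path.size(); ++i) {
      auto it = parent->children.find(path[i]);
      if (it == parent->children.end())
        return; // context not present
      parent = it->second.get();
    }
    auto it = parent->children.find(path.back());
    if (it == parent->children.end())
      return;
    std::unique_ptr<ToyTrieNode> subtree = std::move(it->second);
    parent->children.erase(it);
    merge(root.children[path.back()], std::move(subtree));
  }

private:
  // Fold `src` into `dst`, summing samples and merging children recursively.
  static void merge(std::unique_ptr<ToyTrieNode> &dst,
                    std::unique_ptr<ToyTrieNode> src) {
    if (!dst) {
      dst = std::move(src);
      return;
    }
    dst->samples += src->samples;
    for (auto &kv : src->children)
      merge(dst->children[kv.first], std::move(kv.second));
  }

  ToyTrieNode root;
};
```

In this sketch, promoting `{"main", "foo"}` detaches the `foo` subtree from under `main` and merges it into the top-level `foo` node, so the counts of `foo -> bar` from both places are summed.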

**Integration**

`SampleContextTracker` is now also integrated with AutoFDO, `SampleProfileReader` and `SampleProfileLoader`. When we detect that the input profile contains context-sensitive profile, `SampleContextTracker` will be used to track profiles, and all profile queries will go to `SampleContextTracker` instead of `SampleProfileReader` automatically. Tracking APIs are called automatically for each inline decision from `SampleProfileLoader`.

Differential Revision: https://reviews.llvm.org/D90125
2020-12-06 11:49:18 -08:00
Fangrui Song 140808768d [test] Fix asan/TestCases/Linux/globals-gc-sections-lld.cpp with -fsanitize-address-globals-dead-stripping
r302591 dropped -fsanitize-address-globals-dead-stripping for ELF platforms
(to work around a gold<2.27 bug: https://sourceware.org/bugzilla/show_bug.cgi?id=19002)

Upgrade REQUIRES: from lto (COMPILER_RT_TEST_USE_LLD (set by Android, but rarely used elsewhere)) to lto-available.
2020-12-06 11:11:15 -08:00
Fangrui Song dde44f488c [test] Fix asan/TestCases/Posix/lto-constmerge-odr.cpp when 'binutils_lto' is available
If COMPILER_RT_TEST_USE_LLD is not set, config.use_lld will be False.
However, if feature 'binutils_lto' is available, lto_supported can still be True,
but config.target_cflags will not get -fuse-ld=lld from config.lto_flags

As a result, we may use clang -flto with system 'ld' which may not support the bitcode file, e.g.

  ld: error: /tmp/lto-constmerge-odr-44a1ee.o: Unknown attribute kind (70) (Producer: 'LLVM12.0.0git' Reader: 'LLVM 12.0.0git')
  // The system ld+LLVMgold.so do not support ATTR_KIND_MUSTPROGRESS (70).

Just require lld-available and add -fuse-ld=lld.
2020-12-06 10:31:40 -08:00
Kazu Hirata ddb002d7c7 [InstCombine] Remove replacePointer (NFC)
The declaration was introduced on Feb 10, 2017 in commit
ba01ed00fe without a corresponding
definition.
2020-12-06 10:24:08 -08:00
Kazu Hirata 68de75ec55 [Mips] Use llvm::is_contained (NFC) 2020-12-06 10:12:55 -08:00
Simon Pilgrim 0101fb73de [X86] Fold MOVMSK(ICMP_SGT(X,-1)) -> NOT(MOVMSK(X))
Noticed while triaging PR37506
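
A minimal sketch of why the fold holds, assuming a 4-lane vector of 32-bit integers: a lane is greater than -1 exactly when its sign bit is clear, so the mask of the compare is the bitwise complement of the mask of the input. The helper names (`movmsk`, `icmpSgtMinusOne`, `foldHolds`) are hypothetical, and the complement is restricted to the four lane bits as a modelling assumption.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Vec4 = std::array<int32_t, 4>;

// Toy model of MOVMSK: gather the sign bit of each lane into a bitmask.
unsigned movmsk(const Vec4 &v) {
  unsigned mask = 0;
  for (unsigned i = 0; i < v.size(); ++i)
    if (v[i] < 0) // sign bit set
      mask |= 1u << i;
  return mask;
}

// Lane-wise ICMP_SGT(X, -1): all-ones where the lane is > -1, else zero.
Vec4 icmpSgtMinusOne(const Vec4 &v) {
  Vec4 r{};
  for (unsigned i = 0; i < v.size(); ++i)
    r[i] = v[i] > -1 ? -1 : 0;
  return r;
}

// Check MOVMSK(ICMP_SGT(X,-1)) == NOT(MOVMSK(X)) on the four lane bits.
bool foldHolds(const Vec4 &x) {
  const unsigned laneBits = 0xFu;
  return movmsk(icmpSgtMinusOne(x)) == (~movmsk(x) & laneBits);
}
```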
2020-12-06 17:56:41 +00:00
Simon Pilgrim d6941a1979 [X86] Add tests for missing MOVMSK(ICMP_SGT(X,-1)) -> NOT(MOVMSK(X)) fold
Noticed while triaging PR37506
2020-12-06 17:48:27 +00:00
Layton Kifer ac522f8700 [DAGCombiner] Fold (sext (not i1 x)) -> (add (zext i1 x), -1)
Move fold of (sext (not i1 x)) -> (add (zext i1 x), -1) from X86 to DAGCombiner to improve codegen on other targets.
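
The identity behind this fold can be checked exhaustively, since an i1 has only two values. The sketch below models the i1 as a `bool` and the result as a 32-bit integer; the result width and the helper names are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// sext of an i1: true becomes -1 (all ones), false becomes 0.
int32_t sextI1(bool b) { return b ? -1 : 0; }

// zext of an i1: true becomes 1, false becomes 0.
int32_t zextI1(bool b) { return b ? 1 : 0; }

// Left-hand side of the fold: (sext (not i1 x)).
int32_t sextOfNot(bool x) { return sextI1(!x); }

// Right-hand side of the fold: (add (zext i1 x), -1).
int32_t zextPlusMinusOne(bool x) { return zextI1(x) + (-1); }

// Both inputs of an i1 give the same result on each side.
bool sextNotFoldIsEquivalent() {
  return sextOfNot(false) == zextPlusMinusOne(false) &&
         sextOfNot(true) == zextPlusMinusOne(true);
}
```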

Differential Revision: https://reviews.llvm.org/D91589
2020-12-06 11:52:10 -05:00
Paul C. Anagnostopoulos 0b3e393d6c [TableGen] [CodeGenTarget] Cache the target's instruction namespace.
Differential Revision: https://reviews.llvm.org/D92722
2020-12-06 11:08:30 -05:00
Marek Kurdej e2279c2350 [libc++] [docs] Mark P1865 as complete since 11.0 as it was implemented together with P1135. Fix synopses in <barrier> and <latch>.
It was implemented in commit 54fa9ecd30 ([libc++] Implementation of C++20's P1135R6 for libcxx).
2020-12-06 15:36:52 +01:00
Sanjay Patel 94f6d365e4 [InstCombine] avoid crash on phi with unreachable incoming block (PR48369) 2020-12-06 09:31:47 -05:00
Marek Kurdej f6326736ba [libc++] [LWG3374] Mark `to_address(const Ptr& p)` overload `constexpr`.
Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D92659
2020-12-06 15:26:26 +01:00
Simon Pilgrim db900995ed [CostModel][X86] getGatherScatterOpCost - use default implementation for alt costkinds
Noticed while looking at D92701 - we only really handle TCK_RecipThroughput gather/scatter costs - for now drop back to the default implementation for non-legal gathers/scatters.
2020-12-06 14:08:26 +00:00
Jon Chesterfield e1b8e8a1f4 [libomptarget][amdgpu] Skip device_State allocation when using bss global 2020-12-06 12:13:56 +00:00
Nikita Popov 5e69e2ebad [BasicAA] Migrate "same base pointer" logic to decomposed GEPs
BasicAA has some special bit of logic for "same base pointer" GEPs
that performs a structural comparison: It only looks at two GEPs
with the same base (as opposed to two GEP chains with a MustAlias
base) and compares their indexes in a limited way. I generalized
part of this code in D91027, and this patch merges the remainder
into the normal decomposed GEP logic.

What this code ultimately wants to do is to determine that
gep %base, %idx1 and gep %base, %idx2 don't alias if %idx1 != %idx2,
and the access size fits within the stride.

We can express this in terms of a decomposed GEP expression with
two indexes scale*%idx1 + -scale*%idx2 where %idx1 != %idx2, and
some appropriate checks for sizes and offsets.

This makes the reasoning slightly more powerful, and more
importantly brings all the GEP logic under a common umbrella.
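
As a rough illustration of the rule stated above (not the BasicAA implementation), the sketch below treats the two accesses as byte ranges at offsets `scale*idx1` and `scale*idx2` from the same base and brute-forces small inputs to confirm that "indexes differ and the access size fits within the stride" never holds for overlapping ranges. All function names are hypothetical, and both accesses are assumed to have the same size.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Do the half-open byte ranges [a, a+size) and [b, b+size) overlap?
bool rangesOverlap(int64_t a, int64_t b, int64_t size) {
  return a < b + size && b < a + size;
}

// Toy version of the rule: the offset difference is
// scale*idx1 + (-scale)*idx2, which is at least |scale| in magnitude
// whenever idx1 != idx2, so an access no larger than |scale| cannot overlap.
bool provablyNoAlias(int64_t scale, int64_t idx1, int64_t idx2,
                     int64_t accessSize) {
  return idx1 != idx2 && accessSize <= std::llabs(scale);
}

// Exhaustively check small inputs: the rule must never claim "no alias"
// for two accesses that actually overlap.
bool ruleIsSound() {
  for (int64_t scale = -8; scale <= 8; ++scale)
    for (int64_t size = 1; size <= 8; ++size)
      for (int64_t i = -4; i <= 4; ++i)
        for (int64_t j = -4; j <= 4; ++j) {
          if (!provablyNoAlias(scale, i, j, size))
            continue;
          if (rangesOverlap(scale * i, scale * j, size))
            return false;
        }
  return true;
}
```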

Differential Revision: https://reviews.llvm.org/D92723
2020-12-06 10:27:35 +01:00
Fangrui Song 467b669915 [TargetMachine] Delete asan workaround
687b83ceab has fixed the X86FastISel bug.
We can revert the workaround now. Actually, the workaround commit introduced a
bug: ppc64 should have been excluded.
2020-12-06 00:33:11 -08:00
Fangrui Song 687b83ceab [X86FastISel] Fix MO_GOTPCREL GlobalValue reference in static relocation model
This fixes the bug referenced by 5582a79876
which was exposed by 961f31d8ad.

With this change, `movq src@GOTPCREL, %rcx` => `movq src@GOTPCREL(%rip), %rcx`
2020-12-05 23:13:28 -08:00
Fangrui Song a4cadc2df9 [TargetMachine] Don't imply dso_local for memprof in static relocation model
The workaround is no longer needed with my previous commit to MemProfiler.cpp
2020-12-05 21:39:03 -08:00
Fangrui Song 204d0d51b3 [MemProf] Make __memprof_shadow_memory_dynamic_address dso_local in static relocation model
The x86-64 backend currently has a bug which uses a wrong register for the GOTPCREL reference.
The program will crash without the dso_local specifier.
2020-12-05 21:36:31 -08:00
Vitaly Buka 1f21f6d6a4 [NFC][CodeGen] Simplify SanitizeDtorMembers::Emit 2020-12-05 21:11:27 -08:00
Vitaly Buka 19e7741fef [TargetMachine] Set dso_local for memprof
Similar to 5582a79876
2020-12-05 21:11:04 -08:00
Lang Hames 5bc9c858e3 [ORC] Fix missing forward of Allow filter in TPCDynamicLibrarySearchGenerator. 2020-12-06 15:42:45 +11:00
Craig Topper 5fc8f90f0a [RISCV] Replace a custom SDTypeProfile with SDTIntBinOp which should be sufficient here.
On the surface this would be slightly less optimal for the isel
table, but due to a tablegen issue with HW mode this ends up
generating a smaller isel table.
2020-12-05 20:18:22 -08:00
Fangrui Song b00f345acd [asan][test] Fix odr-vtable.cpp 2020-12-05 19:30:41 -08:00
Fangrui Song 5582a79876 [TargetMachine] Set dso_local if asan is detected
AddressSanitizer instrumentation does not set dso_local on non-thread-local
global variables in -fno-pic and it seems to rely on implied dso_local to work.
Add a hack until we have fixed AddressSanitizer to call setDSOLocal() as
appropriate.

Thanks to Vitaly Buka for reporting the issue and suggesting the way to detect asan.
2020-12-05 17:51:10 -08:00
Jonas Devlieghere ee607ed5c3 [debugserver] Call posix_spawnattr_setarchpref_np through the fn ptr.
Fourth time is the charm? Of course all of these issues don't show up
when the function is available...
2020-12-05 17:38:42 -08:00
Vitaly Buka 452eddf30b [NFC][CodeGen] Add sanitize-dtor-zero-size-field test
The test demonstrates invalid behaviour which will be fixed soon.
2020-12-05 16:39:48 -08:00
Kazu Hirata 5121400e71 [ConstantHoisting] Remove unused declaration optimizeConstants (NFC)
The function was renamed to runImpl on Jul 2, 2016 in commit
071d8306b0, but the old declaration has
remained since.
2020-12-05 16:22:12 -08:00
Philip Reames 8f076291be Add recursive decomposition reasoning to isKnownNonEqual
The basic idea is that by looking through operand instructions which don't change the equality result, we can push the existing known bits comparison down past instructions which would obscure them.

We have analogous handling in InstSimplify for most - though weirdly not all - of these cases starting from an icmp root. It's a bit unfortunate to duplicate logic, but since my actual goal is to extend BasicAA, the icmp logic doesn't help. (And just makes it hard to test here.)  The BasicAA change will be posted separately for review.
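
A toy sketch of the idea, under stated assumptions: adding the same value to both operands in wrapping arithmetic is one operation that does not change the equality result, so a query can strip it from both sides and recurse. The commit message does not list which instructions are handled, so the choice of addition, the 8-bit width, and the names `ToyExpr`, `toyKnownNonEqual`, and `addPreservesEquality` are all illustrative; known constant leaves stand in for the known bits comparison.

```cpp
#include <cassert>
#include <cstdint>

// Exhaustive 8-bit check: (a + c) != (b + c) exactly when a != b.
bool addPreservesEquality() {
  for (unsigned a = 0; a < 256; ++a)
    for (unsigned b = 0; b < 256; ++b)
      for (unsigned c = 0; c < 256; ++c) {
        uint8_t lhs = static_cast<uint8_t>(a + c);
        uint8_t rhs = static_cast<uint8_t>(b + c);
        if ((lhs != rhs) != (a != b))
          return false;
      }
  return true;
}

// Toy expression: a leaf holds a known constant, a non-leaf is
// "inner + value".
struct ToyExpr {
  const ToyExpr *inner; // nullptr for a leaf
  uint8_t value;        // leaf: the constant; non-leaf: the addend
};

// Returns true only when the two expressions are provably different.
bool toyKnownNonEqual(const ToyExpr &l, const ToyExpr &r) {
  if (!l.inner && !r.inner)
    return l.value != r.value; // base case: compare known values
  if (l.inner && r.inner && l.value == r.value)
    return toyKnownNonEqual(*l.inner, *r.inner); // strip the common addend
  return false; // conservatively: not known to differ
}
```

Returning `false` in the last case means "unknown", not "equal", which mirrors how such a query has to stay conservative.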

Differential Revision: https://reviews.llvm.org/D92698
2020-12-05 15:58:19 -08:00
Fangrui Song 109e70d357 [TargetMachine] Drop implied dso_local for an edge case (extern_weak + non-pic + hidden)
This does not deserve special handling. The code should be added to Clang
instead if deemed useful. With this simplification, we can additionally delete
the PIC extern_weak special case.
2020-12-05 15:52:33 -08:00
Kazu Hirata a553ac9791 [CodeGen] llvm::erase_if (NFC) 2020-12-05 15:44:40 -08:00
Aditya Kumar c4e327a960 Remove memory allocation with string
Differential Revision: https://reviews.llvm.org/D92506
2020-12-05 15:14:44 -08:00
Fangrui Song 930b3398c7 [TargetMachine] Clean up TargetMachine::shouldAssumeDSOLocal after x86-32 specific hack is moved to X86Subtarget
With my previous commit, X86Subtarget::classifyGlobalReference has learned to
use MO_NO_FLAG for 32-bit ELF -fno-pic code, the x86-32 special case in
TargetMachine::shouldAssumeDSOLocal can be removed. Since we no longer imply
dso_local for function declarations, we can drop the ppc64 special case as well.

This is NFC in terms of Clang emitted assembly.
2020-12-05 15:13:42 -08:00
Fangrui Song a084c0388e [TargetMachine] Don't imply dso_local on function declarations in Reloc::Static model for ELF/wasm
clang/lib/CodeGen/CodeGenModule sets dso_local on applicable function declarations,
we don't need to duplicate the work in TargetMachine::shouldAssumeDSOLocal.
(Actually the long-term goal (started by r324535) is to drop TargetMachine::shouldAssumeDSOLocal.)

By not implying dso_local, we will respect dso_local/dso_preemptable specifiers
set by the frontend. This allows the proposed -fno-direct-access-external-data
option to work with -fno-pic and prevent a canonical PLT entry (SHN_UNDEF with non-zero st_value)
when taking the address of a function symbol.

This patch should be NFC in terms of the Clang emitted assembly because the case
we don't set dso_local is a case Clang sets dso_local. However, some tests don't
set dso_local on some function declarations and expose some differences. Most
tests have been fixed to be more robust in the previous commit.
2020-12-05 14:54:37 -08:00
Fangrui Song 6b6c3aaeac [test] Add explicit dso_local to function declarations in static relocation model tests
They are currently implicit because TargetMachine::shouldAssumeDSOLocal implies
dso_local.

For such function declarations, clang -fno-pic emits the dso_local specifier.
Adding explicit dso_local makes these tests align with the clang behavior and
helps implement an option to use GOT indirection when taking the address of a
function symbol in -fno-pic (to avoid a canonical PLT entry (SHN_UNDEF with
non-zero st_value)).
2020-12-05 14:54:37 -08:00
Philip Reames bfda69416c [BasicAA] Fix a bug with relational reasoning across iterations
Due to the recursion through phis basicaa does, the code needs to be extremely careful not to reason about equality between values which might represent distinct iterations. I'm generally skeptical of the correctness of the whole scheme, but this particular patch fixes one particular instance which is demonstrably incorrect.

Interestingly, this appears to be the second attempted fix for the same issue. The former fix is incomplete and doesn't address the actual issue.

Differential Revision: https://reviews.llvm.org/D92694
2020-12-05 14:10:21 -08:00
Jonas Devlieghere 13ee00d0c9 [debugserver] Use dlsym for posix_spawnattr_setarchpref_np
The @available check did not work as I thought it did. Use good old
dlsym instead.
2020-12-05 14:06:45 -08:00
Fangrui Song 37f0c8df47 [X86] Emit @PLT for x86-64 and keep unadorned symbols for x86-32
This essentially reverts the x86-64 side effect of r327198.

For x86-32, @PLT (R_386_PLT32) is not suitable in -fno-pic mode so the
code forces MO_NO_FLAG (like a forced dso_local) (https://bugs.llvm.org//show_bug.cgi?id=36674#c6).

For x86-64, both `call/jmp foo` and `call/jmp foo@PLT` emit R_X86_64_PLT32
(https://sourceware.org/bugzilla/show_bug.cgi?id=22791) so there is no
difference using @PLT. Using @PLT is actually favorable because this drops
a difference with -fpie/-fpic code and makes it possible to avoid a canonical
PLT entry when taking the address of an undefined function symbol.
2020-12-05 13:17:47 -08:00
Chris Sears 9737c128f1 [llvmbuildectomy] removed vestigial LLVMBuild.txt files
LLVMBuild has been removed from the build system. However, three LLVMBuild.txt
files remain in the tree. This patch simply removes them.

llvm/lib/ExecutionEngine/Orc/TargetProcess/LLVMBuild.txt
llvm/tools/llvm-jitlink/llvm-jitlink-executor/LLVMBuild.txt
llvm/tools/llvm-profgen/LLVMBuild.txt

Differential Revision: https://reviews.llvm.org/D92693
2020-12-05 22:00:22 +01:00