Commit Graph

370006 Commits

Author SHA1 Message Date
David Green 61bc18de0b [Schedule] Add a MultiHazardRecognizer
This adds a MultiHazardRecognizer and starts to make use of it in the
ARM backend. The idea of the class is to allow multiple independent
hazard recognizers to be added to a single base MultiHazardRecognizer,
allowing them to all work in parallel without requiring them to be
chained into subclasses. They can then be added or not based on cpu or
subtarget features, which will become useful in the ARM backend once
more hazard recognizers are being used for various things.

This also renames ARMHazardRecognizer to ARMHazardRecognizerFPMLx in the
process, to more clearly explain what that recognizer is designed for.

Differential Revision: https://reviews.llvm.org/D72939
2020-10-26 08:06:17 +00:00
Kazushi (Jam) Marukawa 52f03fe115 [VE] Support atomic fence
Support atomic fence instruction and add a regression test.
Add MEMBARRIER pseudo insturction also to use it as a barrier
against to the compiler optimizations.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D90112
2020-10-26 17:03:09 +09:00
Max Kazantsev bfabd7878b Fix broken build after previous commit 2020-10-26 14:55:46 +07:00
Max Kazantsev cdccc82f48 [NFC] Remove unused funciton param 2020-10-26 14:53:22 +07:00
Max Kazantsev 4b5e848bef [NFC] Factor out common code into lambda for further improvement 2020-10-26 14:50:45 +07:00
Max Kazantsev c019099053 [IndVars] Use contextual knowledge when proving trivial conds
No exact example where it would help, but it's a generally a more
powerful way to prove predicates.
2020-10-26 13:48:32 +07:00
Kirill Bobyrev 15f6bad6d7
[clangd] Add dependency on remote index service proto
It requires Index.proto to be built first. Failed builds:
https://github.com/clangd/clangd/runs/1305985916
2020-10-26 07:08:49 +01:00
Christudasan Devadasan 5a061041ec [AMDGPU] Avoid offset register in MUBUF for direct stack object accesses
We use an absolute address for stack objects and
it would be necessary to have a constant 0 for soffset field.

Fixes: SWDEV-228562

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D89234
2020-10-26 11:08:37 +05:30
Craig Topper 82974e0114 [X86] Don't disassemble wbinvd with 0xf2 or 0x66 prefix.
The 0xf3 prefix has been defined as wbnoinvd on Icelake Server. So
the prefix isn't ignored by the CPU. AMD documentation suggests that
wbnoinvd is treated as wbinvd on older processors. Intel documentation
is not clear. Perhaps 0xf2 and 0x66 are treated the same, but its
not documented.

This patch changes TB to PS in the td file so 0xf2 and 0x66 will
be treated as errors. This matches versions of objdump after
wbnoinvd was added.
2020-10-25 20:56:01 -07:00
Liu, Chen3 180548c5c7 [X86] VEX/EVEX prefix doesn't work for inline assembly.
For now, we lost the encoding information if we using inline assembly.
The encoding for the inline assembly will keep default even if we add
the vex/evex prefix.

Differential Revision: https://reviews.llvm.org/D90009
2020-10-26 08:37:45 +08:00
Craig Topper 63ba82ed00 [X86] Use TargetConstant for immediates for VASTART_SAVE_XMM_REGS. 2020-10-25 12:52:56 -07:00
Craig Topper 2ed16aa66f [X86] Use TargetConstant instead of Constant for operands to X86vaarg64. 2020-10-25 12:24:59 -07:00
Sanjay Patel f2c25c7079 [CostModel] remove cost-kind predicate for some vector reduction costs
This is a modified 2nd try of 22d10b8ab4
(reverted by 1c8371692d because it managed
to expose an existing crashing bug that should be fixed by
74ffc823 ).

Original commit message:

This is similar in spirit to 01ea93d85d (memcpy) except that
here the underlying caller assumptions were created for vectorizer
use (throughput) rather than other passes.

That meant targets could have an enormous throughput cost with no
corresponding size, latency, or blended cost increase.
The ARM costs show a small difference between throughput and
size because there's an underlying difference in cmp/sel
costs that is also predicated on cost-kind.

Paraphrasing from the previous commits:
This may not make sense for some callers, but at least now the
costs will be consistently wrong instead of mysteriously wrong.

Targets should provide better overrides if the current modeling
is not accurate.
2020-10-25 15:17:52 -04:00
Sanjay Patel 74ffc823ed [CostModel] fix operand/type accounting for fadd/fmul reductions
I'm not sure if/how this ever worked, but it must not be tested
currently because the basic tests added here were crashing as
noted in the post-review comments for 1c83716 (which reverted
another cost-model fix in 22d10b8ab4).
2020-10-25 15:01:19 -04:00
Nikita Popov ebeef022aa [SCEV] Strenthen nowrap flags after constant folding for mul exprs
Same change as 0dda633317, but for
mul expressions. We want to first fold any constant operans and
then strengthen the nowrap flags, as we can compute more precise
flags at that point.
2020-10-25 19:43:58 +01:00
Aaron Puchert b296c64e64 Thread safety analysis: Nullability improvements in TIL, NFCI
The constructor of Project asserts that the contained ValueDecl is not
null, use that in the ThreadSafetyAnalyzer. In the case of LiteralPtr
it's the other way around.

Also dyn_cast<> is sufficient if we know something isn't null.
2020-10-25 19:37:16 +01:00
Aaron Puchert 5250a03a99 Thread safety analysis: Consider global variables in scope
Instead of just mutex members we also consider mutex globals.
Unsurprisingly they are always in scope. Now the paper [1] says that

> The scope of a class member is assumed to be its enclosing class,
> while the scope of a global variable is the translation unit in
> which it is defined.

But I don't think we should limit this to TUs where a definition is
available - a declaration is enough to acquire the mutex, and if a mutex
is really limited in scope to a translation unit, it should probably be
only declared there.

The previous attempt in 9dcc82f34e was causing false positives because
I wrongly assumed that LiteralPtrs were always globals, which they are
not. This should be fixed now.

[1] https://static.googleusercontent.com/media/research.google.com/en/us/pubs/archive/42958.pdf

Fixes PR46354.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D84604
2020-10-25 19:32:26 +01:00
Nikita Popov 1ff313f098 [SCEV] Always constant fold mul expression operands
Establish parity with the handling of add expressions, by always
constant folding mul expression operands before checking the depth
limit (this is a non-recursive simplification). The code was already
unconditionally constant folding the case where all operands were
constants, but was not folding multiple constant operands together
if there were also non-constant operands.

This requires picking out a different demonstration for depth-based
folding differences in the limit-depth.ll test.
2020-10-25 18:50:06 +01:00
Nikita Popov 22a5cde541 [SCEV] Separate out constant folding in mul expr creation
Separate out the code handling constant folding into a separate
block, that is independent of other folds that need a constant
first operand. Also make some minor adjustments to make the
constant folding look nearly identical to the same code in
getAddExpr().

The only reason this change is not strictly NFC is that the
C1*(C2+V) fold is moved below the constant folding, which means
that it now also applies to C1*C2*(C3+V), as it should.
2020-10-25 18:46:50 +01:00
Nikita Popov 0dda633317 [SCEV] Strength nowrap flags after constant folding
We should first try to constant fold the add expression and only
strengthen nowrap flags afterwards. This allows us to determine
stronger flags if e.g. only two operands are left after constant
folding (and thus "guaranteed no wrap region" code applies) or the
resulting operands are non-negative and thus nsw->nuw strengthening
applies.
2020-10-25 18:00:22 +01:00
Nikita Popov c5718253c9 [IndVars] Regenerate test checks (NFC)
Also run the test case through -instnamer.
2020-10-25 17:45:12 +01:00
Sanjay Patel e77ba263fe [InstSimplify] peek through 'not' operand in logic-of-icmps fold
This extends D78430 to solve cases like:
https://llvm.org/PR47858

There are still missed opportunities shown in the tests,
and as noted in the earlier patches, we have related
functionality in InstCombine, so we may want to extend
other folds in a similar way.

A semi-random sampling of test diff proofs in this patch:
https://rise4fun.com/Alive/sS4C
2020-10-25 11:13:30 -04:00
Sanjay Patel 7de2add829 [InstSimplify] add tests for logic-of-cmps with not op; NFC
One variant of this is shown in:
https://llvm.org/PR47858
2020-10-25 11:13:30 -04:00
Melanie Blower 576d436c82 Correct LIT test failure detected on buildbot after mibintc committed rG2e204e23911b: [clang] Enable support for #pragma STDC FENV_ACCESS D87528 2020-10-25 08:10:34 -07:00
Florian Hahn 968aa6b917 [SLP] Add AArch64 tests with vectorizable compare/select patterns.
This patch adds an additional set of tests that can be vectorized
efficiently on AArch64, using CMxx & BFI.
2020-10-25 15:08:30 +00:00
Simon Pilgrim d64ea0f189 Remove superfluous whitespace around if(). NFC. 2020-10-25 14:38:16 +00:00
Melanie Blower 2e204e2391 [clang] Enable support for #pragma STDC FENV_ACCESS
Reviewers: rjmccall, rsmith, sepavloff

Differential Revision: https://reviews.llvm.org/D87528
2020-10-25 06:46:25 -07:00
Simon Pilgrim 3052e474ec [InstCombine] matchBSwapOrBitReversem - recognise or(fshl(),fshl()) bswap patterns.
I'm not certain InstCombinerImpl::matchBSwapOrBitReverse needs to filter the or(op0(),op1()) ops - there are just too many cases that recognizeBSwapOrBitReverseIdiom/collectBitParts handle now (and quickly).
2020-10-25 10:17:45 +00:00
Simon Pilgrim 5e9f172295 [InstCombine] Add test for or(fshl(),fshl()) bswap pattern.
Currently InstCombinerImpl::matchBSwapOrBitReverse won't match starting from funnel shifts.
2020-10-25 10:07:19 +00:00
Richard Smith f81f09ba89 [c++20] For P0732R2: Support string literal operator templates. 2020-10-25 00:34:15 -07:00
Craig Topper a222d832d5 [X86] Use TargetConstant for FPDiff with X86::TC_RETURN.
It's required to be a constant and can never be in a register so
make it explicit.
2020-10-25 00:29:11 -07:00
Martin Storsjö 1c8371692d Revert "[CostModel] remove cost-kind predicate for vector reduction costs"
This reverts commit 22d10b8ab4.

This broke compilation e.g. like this:
$ cat synth.c
*a;
float *b;
c() {
  for (;;) {
    float d = -*b * *a++;
    d -= *--b * *a++;
    d -= *--b * *a;
    d -= *--b * *a;
    e(d);
  }
}
$ clang -target x86_64-linux-gnu -c -O2 -ffast-math synth.c
clang: ../include/llvm/Support/Casting.h:104: static bool llvm::isa_impl
_cl<To, const From*>::doit(const From*) [with To = llvm::PointerType; Fr
om = llvm::Type]: Assertion `Val && "isa<> used on a null pointer"' fail
ed.
2020-10-25 08:47:54 +02:00
Teresa Johnson 13c62ce99a [MemProf] Temporarily disable part of test
Disable the part of this test that started failing only on the
llvm-avr-linux bot after 5c20d7db9f.
Unfortunately, "XFAIL: avr" does not work. Still in the process of
trying to figure out how to debug.
2020-10-24 23:07:34 -07:00
Richard Smith 7b3515880c For P0732R2, P1907R1: ensure that template parameter objects don't refer
to disallowed objects or have non-constant destruction.
2020-10-24 22:11:43 -07:00
Nathan Ridge aaa8b44d19 [clangd] Add a TestWorkspace utility
TestWorkspace allows easily writing tests involving multiple
files that can have inclusion relationships between them.

BackgroundIndexTest.RelationsMultiFile is refactored to use
TestWorkspace, and moved to FileIndexTest as it no longer
depends on BackgroundIndex.

Differential Revision: https://reviews.llvm.org/D89297
2020-10-24 20:15:17 -04:00
Arthur Eubanks c039e83a2c Fix typo SSC -> SCC 2020-10-24 16:26:48 -07:00
Fangrui Song 398b81067c [ELF] Don't crash on R_X86_64_GOTPCRELX for test/binop instructions
While MC did not produce R_X86_64_GOTPCRELX for test/binop instructions
(movl/adcl/addl/andl/...) before the previous commit, this code path has been
exercised by -fno-integrated-as for GNU as since 2016: -no-pie relaxing
may incorrectly access loc[-3] and produce a corrupted instruction.

Simply handle test/binop R_X86_64_GOTPCRELX like R_X86_64_GOTPCREL.
2020-10-24 15:14:17 -07:00
Fangrui Song f04d92af94 [X86] Produce R_X86_64_GOTPCRELX for test/binop instructions (MOV32rm/TEST32rm/...) when -Wa,-mrelax-relocations=yes is enabled
We have been producing R_X86_64_REX_GOTPCRELX (MOV64rm/TEST64rm/...) and
R_X86_64_GOTPCRELX for CALL64m/JMP64m without the REX prefix since 2016 (to be
consistent with GNU as), but not for MOV32rm/TEST32rm/...
2020-10-24 15:14:17 -07:00
Drew Fisher 1e09dbb6a9 [asan] Fix stack-use-after-free checks on non-main thread on Fuchsia
While some platforms call `AsanThread::Init()` from the context of the
thread being started, others (like Fuchsia) call `AsanThread::Init()`
from the context of the thread spawning a child.  Since
`AsyncSignalSafeLazyInitFakeStack` writes to a thread-local, we need to
avoid calling it from the spawning thread on Fuchsia.  Skipping the call
here on Fuchsia is fine; it'll get called from the new thread lazily on first
attempted access.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D89607
2020-10-24 14:29:32 -07:00
Drew Fisher 29480c6c74 [asan][fuchsia] set current thread before reading thread state
When enabling stack use-after-free detection, we discovered that we read
the thread ID on the main thread while it is still set to 2^24-1.

This patch moves our call to AsanThread::Init() out of CreateAsanThread,
so that we can call SetCurrentThread first on the main thread.

Reviewed By: mcgrathr

Differential Revision: https://reviews.llvm.org/D89606
2020-10-24 14:23:09 -07:00
Fangrui Song d5adadb3a5 [AArch64][GlobalISel] Fix -Wunused-variable. NFC 2020-10-24 12:47:11 -07:00
Nico Weber 6f9d84bb26 Revert "hwasan: Disable operator {new,delete} interceptors when interceptors are disabled."
This reverts commit fa66bcf4bc.
Seems to break tests, see https://reviews.llvm.org/D89827#2351930
2020-10-24 15:04:22 -04:00
Sanjay Patel 22d10b8ab4 [CostModel] remove cost-kind predicate for vector reduction costs
This is similar in spirit to 01ea93d85d (memcpy) except that
here the underlying caller assumptions were created for vectorizer
use (throughput) rather than other passes.

That meant targets could have an enormous throughput cost with no
corresponding size, latency, or blended cost increase.
The ARM costs show a small difference between throughput and
size because there's an underlying difference in cmp/sel
costs that is also predicated on cost-kind.

Paraphrasing from the previous commits:
This may not make sense for some callers, but at least now the
costs will be consistently wrong instead of mysteriously wrong.

Targets should provide better overrides if the current modeling
is not accurate.
2020-10-24 13:20:17 -04:00
Benjamin Kramer 39a0d6889d [X86] Add a stub for Intel's alderlake.
No scheduling, no autodetection.
2020-10-24 19:01:22 +02:00
Benjamin Kramer bd2cf96c09 [X86] Add a stub for znver3 based on the little public information there is in AMD's manuals
No scheduling, no autodetection. Just enough so -march=znver3 works.
2020-10-24 19:01:22 +02:00
Benjamin Kramer b8d2b6f6cf Unbreak the clang-interpreter example after 0aec49c853 2020-10-24 19:01:21 +02:00
dfukalov 9068c20965 [AMDGPU][CostModel] Refine cost model for half- and quarter-rate instructions.
1. Throughput and codesize costs estimations was separated and updated.
2. Updated fdiv cost estimation for different cases.
3. Added scalarization processing for types that are treated as !isSimple() to
improve codesize estimation in getArithmeticInstrCost() and
getArithmeticInstrCost(). The code was borrowed from TCK_RecipThroughput path
of base implementation.

Next step is unify scalarization part in base class that is currently works for
TCK_RecipThroughput path only.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D89973
2020-10-24 19:53:08 +03:00
David Green 92205bf122 [ARM] Remove some dead code. NFC 2020-10-24 17:22:49 +01:00
Andrzej Warzynski cbb7f1420b [flang][tests] Fix Python bug in the lit config
Without this change LIT tests for Flang fail with:
```
TypeError: append() takes exactly one argument (2 given)
```
2020-10-24 17:04:25 +01:00
Stefan Gränitz 66abe650ff Reapply "[jitlink][ELF] Add zero-fill blocks for symbols in section SHN_COMMON"
Root cause of the test failure was fixed with:
[JITLink][ELF] PCRel32GOTLoad edge offset can be smaller three

This reverts commit 10b1a61baf.
2020-10-24 16:58:06 +02:00