With this change, InstrBuilder emits an error if the MCInst sequence contains an
instruction with a variadic opcode, and a non-zero number of variadic operands.
Currently we don't know how to correctly analyze variadic opcodes. The problem
with variadic operands is that there is no information for them in the opcode
descriptor (i.e. MCInstrDesc). That means, we don't know which variadic operands
are defs, and which are uses.
In future, we could try to conservatively assume that any extra register
operands is both a register use and a register definition.
This patch fixes a subtle bug in the evaluation of read/write operands for ARM
VLD1 with implicit index update. Added test vld1-index-update.s
llvm-svn: 347503
...and use them to avoid creating obviously undef values as
discussed in the post-commit thread for r347478.
The diffs in vector div/rem show that we were missing real
optimizations by creating bogus shift nodes.
llvm-svn: 347502
I'm not sure if this actually preserves the original intent
of this test, but if we leave it as-is, the -1 (oversized)
shift should be folded to undef and allow deleting half
of the output.
llvm-svn: 347501
The test fails with a local modification to
clang-tidy/ClangTidyDiagnosticConsumer.cpp to include fixes into the key when
deduplicating the warnings.
llvm-svn: 347495
In ARMOperand::print:
- Print human-readable register names, instead of numbers.
- Print the correct names for IT condition masks (these were in the wrong order
before).
- Print all parts of memory operands, not just the base register.
This makes the output of llvm-mc -show-inst-operands more readable.
Differential revision: https://reviews.llvm.org/D54850
llvm-svn: 347494
RetireControlUnitStatistics now reports extra information about the ROB and the
avg/maximum number of entries consumed over the entire simulation.
Example:
Retire Control Unit - number of cycles where we saw N instructions retired:
[# retired], [# cycles]
0, 109 (17.9%)
1, 102 (16.7%)
2, 399 (65.4%)
Total ROB Entries: 64
Max Used ROB Entries: 35 ( 54.7% )
Average Used ROB Entries per cy: 32 ( 50.0% )
Documentation in llvm/docs/CommandGuide/llvmn-mca.rst has been updated to
reflect this change.
llvm-svn: 347493
I am working on making FileCheck stricter (in D54769 and D53710) so that it
issues diagnostics when there's something wrong with tests.
This is a cleanup for dangling prefixes in the ARM codegen tests, e.g.:
--check-prefixes=A,B
where A occurs in the check file, but B doesn't. This can be innocent if A does
all the required checking, but can also be a bug in that test if it results in
the test actually not checking anything (if A for example only checks a common
label). Test CodeGen/ARM/smml.ll is such an example.
Differential Revision: https://reviews.llvm.org/D54842
llvm-svn: 347487
When removing edges, we also update Phi inputs and may end up removing
a Phi if it has only one input. We should not do it for edges that leave the current
loop because these Phis are LCSSA Phis and need to be preserved.
Thanks @dmgreen for finding this!
Differential Revision: https://reviews.llvm.org/D54841
llvm-svn: 347484
This code takes a truncate, fp_to_int, or int_to_fp with a legal result type and an input type that needs to be split and enlarges the elements in the result type before doing the split. Then inserts a follow up truncate or fp_round after concatenating the two halves back together.
But if the input type of the original op is being split on its way to ultimately being scalarized we're just going to end up building a vector from scalars and then truncating or rounding it in the vector register. Seems kind of silly to enlarge the result element type of the operation only to end up with scalar code and then building a vector with large elements only to make the elements smaller again in the vector register. Seems better to just try to get away producing smaller result types in the scalarized code.
The X86 test case that changes is a pretty contrived test case that exists because of a bug we used to have in our AVG matching code. I think the code is better now, but its not realistic anyway.
llvm-svn: 347482
All of STB_GLOBAL/STB_WEAK/STB_GNU_UNIQUE are treated as export symbols, see:
glibc/elf/dl-lookup.c:do_lookup_x
musl/ldso/dynlink.c OK_BINDS
Though ld.so does not read binding, the currently used STV_DEFAULT or STV_PROTECTED is a good emulation of linker behavior.
llvm-svn: 347481
SplitVecOp_TruncateHelper tries to introduce a multilevel truncate to avoid scalarization. But if splitting the result type would still be a legal type we don't need to do that.
The comment block at the top of the function implied that this was already implemented. I looked back through the history and it doesn't look to have ever been checked.
llvm-svn: 347479
We fail to canonicalize IR this way (prefer 'not' ops to arbitrary 'xor'),
but that would not matter without this patch because DAGCombiner was
reversing that transform. I think we need this transform in the backend
regardless of what happens in IR to catch cases where the shift-xor
is formed late from GEP or other ops.
https://rise4fun.com/Alive/NC1
Name: shl
Pre: (-1 << C2) == C1
%shl = shl i8 %x, C2
%r = xor i8 %shl, C1
=>
%not = xor i8 %x, -1
%r = shl i8 %not, C2
Name: shr
Pre: (-1 u>> C2) == C1
%sh = lshr i8 %x, C2
%r = xor i8 %sh, C1
=>
%not = xor i8 %x, -1
%r = lshr i8 %not, C2
https://bugs.llvm.org/show_bug.cgi?id=39657
llvm-svn: 347478
The test was reverted because it failed on
llvm-clang-x86_64-expensive-checks-win builder, and that was because
-DEXPENSIVE_CHECKS adds randomness to llvm::sort(), affecting the order of
relocation table entries.
Modified the test to not have two relocations at the same offset.
llvm-svn: 347476
This is a revert of r347421, except I'm using the with_system_cxx_lib
lit feature instead of availability to mark the test as unsupported
(because the problem is a bug in the dylib itself). In r347421, I said
I wasn't able to reproduce the issue and that's why I was removing it:
this was because I ran lit slightly wrong. The problem mentioned really
exists.
llvm-svn: 347475
We used to print a Python list corresponding to the command. It is more
useful to print the joined string so it can be copy/pasted directly when
a test fails.
llvm-svn: 347471
The test I'm adding passes without the change due to the deduplication logic in
ClangTidyDiagnosticConsumer::take(). However this bug manifests in our internal
integration with clang-tidy.
I've verified the fix by locally changing LessClangTidyError to consider
replacements.
llvm-svn: 347470
Also, try to minimize the number of queries to the memory queues to speedup the
analysis.
On average, this change gives a small 2% speedup. For memcpy-like kernels, the
speedup is up to 5.5%.
llvm-svn: 347469
Summary:
Previously, removeDoc followed by an addDoc to TUScheduler resulted in
racy diagnostic responses, i.e. the old dianostics could be delivered
to the client after the new ones by TUScheduler.
To workaround this, we tracked a version number in ClangdServer and
discarded stale diagnostics. After this commit, the TUScheduler will
stop delivering diagnostics for removed files and the workaround in
ClangdServer is not required anymore.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: javed.absar, ioeric, MaskRay, jkorous, arphaman, jfb, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D54829
llvm-svn: 347468
Summary:
Instead of passing around a list of supported URI schemes in clangd, we
expose an interface to convert a path to URI using any compatible scheme
that has been registered. It favors customized schemes and falls
back to "file" when no other scheme works.
Changes in this patch are:
- URI::create(AbsPath, URISchemes) -> URI::create(AbsPath). The new API finds a
compatible scheme from the registry.
- Remove URISchemes option everywhere (ClangdServer, SymbolCollecter, FileIndex etc).
- Unit tests will use "unittest" by default.
- Move "test" scheme from ClangdLSPServer to ClangdMain.cpp, and only
register the test scheme when lit-test or enable-lit-scheme is set.
(The new flag is added to make lit protocol.test work; I wonder if there
is alternative here.)
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D54800
llvm-svn: 347467
Summary:
The full path of the input header depends on the execution environment
and may result in different behavior (e.g. when different URI schemes are used).
Reviewers: sammccall
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D54833
llvm-svn: 347466
Summary:
r346756 refined clang-format to not treat the `[` in `asm (...: [] ..)` as an
ObjCExpr. However that's not enough, as we might have a comma-separated list of
such clobbers as in the newly added test.
This updates the detection to instead look at the Line's first token being `asm`
and not mark `[`-s as ObjCExprs in this case.
Reviewers: djasper, benhamilton
Reviewed By: djasper, benhamilton
Subscribers: benhamilton, cfe-commits
Differential Revision: https://reviews.llvm.org/D54795
llvm-svn: 347465
This avoids a heap allocation most of the times.
This patch gives a small but consistent 3% speedup on a release build (up to ~5%
on a debug build).
llvm-svn: 347464
This patch fixes an invalid memory read introduced by r346487.
Before this patch, partial register write had to query the latency of the
dependent full register write by calling a method on the full write descriptor.
However, if the full write is from an already retired instruction, chances are
that the EntryStage already reclaimed its memory.
In some parial register write tests, valgrind was reporting an invalid
memory read.
This change fixes the invalid memory access problem. Writes are now responsible
for tracking dependent partial register writes, and notify them in the event of
instruction issued.
That means, partial register writes no longer need to query their associated
full write to check when they are ready to execute.
Added test X86/BtVer2/partial-reg-update-7.s
llvm-svn: 347459
A consequence of r347274 is that SCALAR_TO_VECTOR can be converted into
BUILD_VECTOR by SimplifyDemandedBits, but LowerBUILD_VECTOR can turn
BUILD_VECTOR into SCALAR_TO_VECTOR so we get an infinite loop.
Fix this by making LowerBUILD_VECTOR not do this transformation for those
vectors that would get transformed back, i.e. BUILD_VECTOR of a single-element
constant vector. Doing that means we get a DUP, which we then need to recognise
in ISel as a copy.
llvm-svn: 347456
Now it returns Symbol. This should be NFC that
avoids doing cast at the caller's sides.
Differential revision: https://reviews.llvm.org/D54627
llvm-svn: 347455