This teaches simplifyDemandedBits to handle constant splat vector shifts.
This required changing some uses of getZExtValue to getLimitedValue since we can't rely on legalization using getShiftAmountTy for the shift amount.
I believe there may have been a bug in the ((X << C1) >>u ShAmt) handling where we didn't check if the inner shift was too large. I've fixed that here.
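A minimal standalone sketch of the two points above, assuming a type of at most 64 bits: the shift amount is clamped rather than trusted, and an out-of-range shift makes the helper bail out. This is not the code from the patch; `limitedValue` and `demandedBitsOfShlOperand` are hypothetical names, with `limitedValue` only imitating what APInt::getLimitedValue does.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for APInt::getLimitedValue(Limit): clamp the value
// instead of assuming it is small enough to be used directly.
uint64_t limitedValue(uint64_t Value, uint64_t Limit) {
  return Value > Limit ? Limit : Value;
}

// For Result = X << ShAmt on a BitWidth-bit type, compute which bits of X are
// needed to produce the bits of Result selected by DemandedMask.
// Returns false (leaving OperandMask untouched) when the shift amount is out
// of range, so the caller can skip the simplification.
bool demandedBitsOfShlOperand(uint64_t DemandedMask, uint64_t ShAmt,
                              unsigned BitWidth, uint64_t &OperandMask) {
  assert(BitWidth >= 1 && BitWidth <= 64 && "sketch only models up to 64 bits");
  uint64_t Amt = limitedValue(ShAmt, BitWidth);
  if (Amt >= BitWidth)
    return false; // shift too large: nothing sensible to demand
  uint64_t TypeMask = BitWidth == 64 ? ~0ULL : ((1ULL << BitWidth) - 1);
  OperandMask = (DemandedMask & TypeMask) >> Amt;
  return true;
}
```

The same range check would apply to the inner shift of a ((X << C1) >>u ShAmt) pattern before combining the two amounts.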
I had to add new patterns to ARM because the zext/sext the patterns were trying to look for got turned into an any_extend with this patch. Happy to split that out too, but not sure how to test without this change.
Differential Revision: https://reviews.llvm.org/D37665
llvm-svn: 314139
Add two callbacks to MachineEvaluator, so that specific implementations
can specify more details about register classes:
- composeWithSubRegIndex(RC,Idx), to provide the register class for a
register from RC used in conjunction with a subregister index Idx.
- getPhysRegBitWidth(Reg), to provide the size in bits of the given
physical register.
llvm-svn: 314136
This replaces the large number of patterns that handle every possible case of zeroing after a masked compare with a few simpler patterns that use a predicate to check for a masked compare producer.
This is similar to what we do for detecting free GR32->GR64 zero extends and free xmm->ymm/zmm zero extends.
This shrinks the isel table from ~590k to ~531k. This is a roughly 10% reduction in size.
Differential Revision: https://reviews.llvm.org/D38217
llvm-svn: 314133
Normal customer devices won't be able to run these tests; we're hoping to get
a public facing bot set up at some point. Both devices pass the testsuite without
any errors or failures.
I have seen some instability with the armv7 test runs; I may submit additional patches
to address this. arm64 looks good.
I'll be watching the bots for the rest of today; if any problems are introduced by
this patch I'll revert it - if anyone sees a problem with their bot that I don't
see, please do the same. I know it's a rather large patch.
One change I had to make specifically for iOS devices was that debugserver can't
create files. There were several tests that launch the inferior process redirecting
its output to a file, then they retrieve the file. They were not trying to test
file redirection in these tests, so I rewrote those to write their output to a file
directly.
llvm-svn: 314132
Using TCP sockets is insecure against local attackers, and possibly
against remote attackers too (some vulnerabilities may allow tricking a
browser to make a request to localhost). Use socketpair (which is immune
to such attacks) on all Unix platforms.
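As a rough illustration of the socketpair approach, the sketch below creates a connected pair of AF_UNIX sockets and passes a short message between the two ends. Unlike a TCP socket on localhost, the pair has no address that another process could connect to. The function name `echoOverSocketPair` is hypothetical and the code is not taken from the patch.

```cpp
#include <cassert>
#include <string>
#include <sys/socket.h>
#include <unistd.h>

// Send Message through one end of a socketpair and read it back from the
// other end. Returns an empty string if the pair cannot be created.
std::string echoOverSocketPair(const std::string &Message) {
  // Keep the message small so a single write/read pair cannot block.
  assert(Message.size() <= 4096 && "sketch only handles short messages");

  int Fds[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, Fds) != 0)
    return std::string();

  std::string Received;
  ssize_t Written = write(Fds[0], Message.data(), Message.size());
  if (Written > 0) {
    Received.resize(static_cast<size_t>(Written));
    ssize_t Got = read(Fds[1], &Received[0], Received.size());
    Received.resize(Got > 0 ? static_cast<size_t>(Got) : 0);
  }

  close(Fds[0]);
  close(Fds[1]);
  return Received;
}
```

In a real debugger/debugserver setup, one descriptor would be inherited by the child process instead of both ends living in the same process.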
Patch by Demi Marie Obenour < demiobenour@gmail.com >
Differential Revision: https://reviews.llvm.org/D33213
llvm-svn: 314127
Now that SCEV can handle 'urem', 'urem' is a better canonical form than 'srem' because it has well-defined behavior.
This is a follow-up to D34598.
Differential Revision: https://reviews.llvm.org/D38072
llvm-svn: 314125
The transform to convert an extract-of-a-select-of-vectors was added at:
rL194013
And a question about the validity of this transform was raised in the review:
https://reviews.llvm.org/D1539:
...but not answered AFAICT.
Most of the motivating cases in that patch are now handled by other combines. These are the tests that were added with
the original commit, but they are not regressing even after we remove the transform in this patch.
The diffs we see after removing this transform cause us to avoid increasing the instruction count, so we don't want to do
those transforms as canonicalizations.
The motivation for not turning a vector-select-of-vectors into a scalar operation is shown in PR33301:
https://bugs.llvm.org/show_bug.cgi?id=33301
...in those cases, we'll get vector ops with this patch rather than the vector/scalar mix that we currently see.
Differential Revision: https://reviews.llvm.org/D38006
llvm-svn: 314117
Summary:
This code iterates the 'Orders' vector in parallel with the DbgValue
list, emitting all DBG_VALUEs that occurred between the last IR order
insertion point and the next insertion point. This assumes the
SDDbgValue list is sorted in IR order, which it usually is. However, it
is not sorted when a node with a debug value is replaced with another
one. When this happens, TransferDbgValues is called, and the new value
is added to the end of the list.
The problem can be solved by stably sorting the list by IR order.
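A small sketch of the idea, assuming each debug value carries an IR order number: std::stable_sort restores IR order while keeping entries with equal order in their original relative position. `DbgValueEntry` and `namesInIROrder` are hypothetical names used only for illustration, not the SelectionDAG types.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a debug value with its IR order.
struct DbgValueEntry {
  unsigned Order;
  std::string Name;
};

// Stably sort the entries by IR order and return their names in that order.
std::vector<std::string> namesInIROrder(std::vector<DbgValueEntry> Values) {
  std::stable_sort(Values.begin(), Values.end(),
                   [](const DbgValueEntry &A, const DbgValueEntry &B) {
                     return A.Order < B.Order;
                   });
  std::vector<std::string> Names;
  for (const DbgValueEntry &V : Values)
    Names.push_back(V.Name);
  return Names;
}
```

A stable sort matters here because several debug values can share one IR order, and their relative order should not change.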
Reviewers: aprantl, Ka-Ka
Reviewed By: aprantl
Subscribers: MatzeB, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D38197
llvm-svn: 314114
Summary:
Following D38139, we now consolidate the TSD definition, merging the shared
TSD definition with the exclusive TSD definition. We introduce a boolean set
at initialization denoting the need for the TSD to be unlocked or not. This
adds some unused members to the exclusive TSD, but increases consistency and
reduces the fragmentation of the definitions.
We remove the fallback mechanism from `scudo_allocator.cpp` and add a fallback
TSD in the non-shared version. Since the shared version doesn't require one,
this makes overall more sense.
There are a couple of additional cosmetic changes: removing the header guards
from the remaining `.inc` files and adding an error string to a `CHECK`.
Question to reviewers: I thought about friending `getTSDAndLock` in `ScudoTSD`
so that the `FallbackTSD` could `Mutex.Lock()` directly instead of `lock()`
which involved zeroing out the `Precedence`, which is unused otherwise. Is it
worth doing?
Reviewers: alekseyshl, dvyukov, kcc
Reviewed By: dvyukov
Subscribers: srhines, llvm-commits
Differential Revision: https://reviews.llvm.org/D38183
llvm-svn: 314110
This patch expands the support of lowerInterleavedStore to 8x8i stride 4.
LLVM creates suboptimal shuffle code-gen for AVX2.
Overall, this patch is a specific fix for the pattern (Stride=4, VF=8), and we plan to include more patterns in the future.
The patch goal is to optimize the following sequence:
At the end of the computation, we have xmm2, xmm0, xmm12 and xmm3 holding
each 8 chars:
c0, c1, ..., c7
m0, m1, ..., m7
y0, y1, ..., y7
k0, k1, ..., k7
And these need to be transposed/interleaved and stored like so:
c0 m0 y0 k0 c1 m1 y1 k1 c2 m2 y2 k2 c3 m3 y3 k3 ....
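The target layout can be written as a scalar reference loop. This is only a sketch of what the interleaved store must produce, not the AVX2 shuffle sequence the patch generates; `interleaveStride4` is a hypothetical name.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Interleave four 8-element byte vectors with stride 4:
// Out = c0 m0 y0 k0 c1 m1 y1 k1 ... c7 m7 y7 k7
std::array<uint8_t, 32> interleaveStride4(const std::array<uint8_t, 8> &C,
                                          const std::array<uint8_t, 8> &M,
                                          const std::array<uint8_t, 8> &Y,
                                          const std::array<uint8_t, 8> &K) {
  std::array<uint8_t, 32> Out{};
  for (std::size_t I = 0; I < 8; ++I) {
    Out[4 * I + 0] = C[I];
    Out[4 * I + 1] = M[I];
    Out[4 * I + 2] = Y[I];
    Out[4 * I + 3] = K[I];
  }
  return Out;
}
```

A vectorized lowering has to produce the same 32 bytes, typically with a few unpack/shuffle steps rather than 32 scalar stores.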
Reviewers
DavidKreitzer
Farhana
zvi
igorb
guyblank
RKSimon
Ayal
Differential Revision: https://reviews.llvm.org/D36058
Change-Id: I3cc5c2ca5d6318901c192a4428493b99ef424c32
llvm-svn: 314109
[Synopsis]
Calling elf::link(...) a second time leads to a segmentation fault. The first call finishes correctly.
[Solution]
Clear the rest of globals.
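The general failure mode can be sketched as follows: an entry point that keeps its state in globals must reset them on every call, or the second call sees leftovers from the first. The names `InputFiles` and `linkOnce` are hypothetical and do not correspond to the actual lld globals.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical global state that survives between calls.
static std::vector<std::string> InputFiles;

// Entry point that may be called more than once in the same process.
// Returns the number of inputs recorded for this call.
std::size_t linkOnce(const std::vector<std::string> &Args) {
  InputFiles.clear(); // reset state left behind by a previous call
  for (const std::string &A : Args)
    InputFiles.push_back(A);
  return InputFiles.size();
}
```

Without the `clear()` call, the second invocation would also see the first invocation's inputs.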
Reviewed by: George Rimar and Rui Ueyama
Differential Revision: http://reviews.llvm.org/D38131
llvm-svn: 314108
As mentioned in https://reviews.llvm.org/D33718, this simply adds another
pattern to the compare elimination sequence and is committed without a
differential review.
llvm-svn: 314106
This test can't pass on MIPS64 due to the lack of versioned interceptors
for asan and company. The interceptors bind to the earlier version of
sem_init rather than the latest version. For MIPS64el this causes an
accidental pass while MIPS64 big endian fails due to reading back a
different 32bit word to what sem_init wrote when the test is corrected
to use 64bit atomics.
llvm-svn: 314100
Previously, InX::Got and InX::MipsGot synthetic sections
were not removed if ElfSym::GlobalOffsetTable was defined.
ElfSym::GlobalOffsetTable is a symbol for _GLOBAL_OFFSET_TABLE_.
This patch moves the ElfSym::GlobalOffsetTable check out of removeUnusedSyntheticSections.
Also note that there was no point in checking ElfSym::GlobalOffsetTable for the MIPS case
because InX::MipsGot::empty() always returns false for the non-relocatable case, and for
relocatable output we do not create special symbols anyway.
Differential revision: https://reviews.llvm.org/D37623
llvm-svn: 314099
When -verbose is specified, this patch prints the name of each input orphan section
that is assigned to an output section.
Differential revision: https://reviews.llvm.org/D37517
llvm-svn: 314098
Previously, when a BC file had a global variable that was accessed from a linker script,
the variable was optimized away or inlined by IPO.
This patch marks symbols on the left side of an assignment expression as LinkerRedefined,
which prevents those optimizations for them.
Differential revision: https://reviews.llvm.org/D37059
llvm-svn: 314097
Summary:
Right now there are two functions with the same name, one does the work
and the other one returns true if expansion is needed. Rename
TargetTransformInfo::expandMemCmp to make it more consistent with other
members of TargetTransformInfo.
Remove the unused Instruction* parameter.
Differential Revision: https://reviews.llvm.org/D38165
llvm-svn: 314096
We used to sort and uniquify CU vectors, but it looks like CU vectors in
.gdb_index sections created by gold are not guaranteed to be sorted.
llvm-svn: 314095
We used to use std::set to uniquify CU vector elements, but as we know,
std::set is pretty slow. Fortunately we didn't actually have to use a
std::set here. This patch replaces it with std::vector.
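One common way to uniquify with a std::vector instead of a std::set is to append everything and then sort and drop adjacent duplicates. The sketch below shows that generic technique; whether the patch does exactly this is an assumption, and `uniquify` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Return the distinct elements of V in ascending order.
std::vector<uint32_t> uniquify(std::vector<uint32_t> V) {
  std::sort(V.begin(), V.end());
  // std::unique moves duplicates to the tail; erase removes that tail.
  V.erase(std::unique(V.begin(), V.end()), V.end());
  return V;
}
```

This keeps the data contiguous and avoids one node allocation per element, which is typically where std::set loses time.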
With this patch, lld's -gdb-index overhead when linking a clang debug
build is now about 1 second (8.65 seconds without -gdb-index vs 9.60
seconds with -gdb-index). Since gold takes more than 6 seconds to create
a .gdb_index for the same output, our number isn't that bad.
llvm-svn: 314094
Previously, we had two levels of hash table lookup. The first hash
lookup uses CachedHashStringRefs as keys and returns offsets in string
table. Then, we did the second hash table lookup to obtain GdbSymbol
pointers. But we can directly map strings to GdbSymbols.
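A minimal sketch of the single-lookup idea, assuming a plain std::unordered_map keyed by symbol name. `GdbSymbolSketch` and `SymbolMap` are hypothetical types for illustration; they are not lld's GdbSymbol or its hash table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a .gdb_index symbol entry.
struct GdbSymbolSketch {
  std::string Name;
  std::vector<uint32_t> CuVector;
};

// Maps a symbol name directly to its entry with one hash lookup,
// instead of name -> string table offset -> symbol.
class SymbolMap {
  std::unordered_map<std::string, GdbSymbolSketch> Map;

public:
  // Return the entry for Name, creating an empty one on first use.
  GdbSymbolSketch &getOrCreate(const std::string &Name) {
    auto Result = Map.emplace(Name, GdbSymbolSketch{Name, {}});
    return Result.first->second;
  }

  std::size_t size() const { return Map.size(); }
};
```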
One test file is updated in this patch because we no longer have a '\0'
byte at the start of the string pool, which was automatically inserted
by StringTableBuilder.
This patch speeds up Clang debug build (with -gdb-index) link time by
0.3 seconds.
llvm-svn: 314092
This change alone speeds up linking of Clang debug build with -gdb-index
by 1.2 seconds, from 12.5 seconds to 11.3 seconds. (Without -gdb-index,
lld takes 8.5 seconds to link the same input files.)
llvm-svn: 314090
In order to keep track of symbol renaming, we used to have
Config->SymbolRenaming, and whether a symbol is in the map or not
affects its symbol attribute (i.e. "LinkerRedefined" bit).
This patch adds a "CanInline" bit to Symbol to aggregate symbol
information in one place and removes the member from Config since
no one except SymbolTable now uses the table.
llvm-svn: 314088