Bswap isn't a simple operation so we need to make sure we are really removing a call to it before doing these simplifications.
For the case when both LHS and RHS are bswaps I've allowed it to be moved if either LHS or RHS has a single use since that at least allows us to move it later where it might find another bswap to combine with and it decreases the use count on the other side so maybe the other user can be optimized.
Differential Revision: https://reviews.llvm.org/D34974
llvm-svn: 307273
When the formulae search space is huge, LSR uses a series of heuristic to keep
pruning the search space until the number of possible solutions are within
certain limit.
The big hammer of the series of heuristics is NarrowSearchSpaceByPickingWinnerRegs,
which picks the register which is used by the most LSRUses and deletes the other
formulae which don't use the register. This is a effective way to prune the search
space, but quite often not a good way to keep the best solution. We saw cases before
that the heuristic pruned the best formula candidate out of search space.
To relieve the problem, we introduce a new heuristic called
NarrowSearchSpaceByFilterFormulaWithSameScaledReg. The basic idea is in order to
reduce the search space while keeping the best formula, we want to keep as many
formulae with different Scale and ScaledReg as possible. That is because the central
idea of LSR is to choose a group of loop induction variables and use those induction
variables to represent LSRUses. An induction variable candidate is often represented
by the Scale and ScaledReg in a formula. If we have more formulae with different
ScaledReg and Scale to choose, we have better opportunity to find the best solution.
That is why we believe pruning search space by only keeping the best formula with the
same Scale and ScaledReg should be more effective than PickingWinnerReg. And we use
two criteria to choose the best formula with the same Scale and ScaledReg. The first
criteria is to select the formula using less non shared registers, and the second
criteria is to select the formula with less cost got from RateFormula. The patch
implements the heuristic before NarrowSearchSpaceByPickingWinnerRegs, which is the
last resort.
Testing shows we get 1.8% and 2% on two internal benchmarks on x86. llvm nightly
testsuite performance is neutral. We also tried lsr-exp-narrow and it didn't help
on the two improved internal cases we saw.
Differential Revision: https://reviews.llvm.org/D34583
llvm-svn: 307269
This was auto-generated using an older version of the script,
and that version does not work with phis, so if we enable
expansion it will go bad.
llvm-svn: 307267
Summary: Added MachineVerifier code to check register ties more thoroughly, especially so that physical registers that are tied are the same. This may help e.g. when creating MIR files.
Original patch by Jesper Antonsson
Reviewers: stoklund, sanjoy, qcolombet
Reviewed By: qcolombet
Subscribers: qcolombet, llvm-commits
Differential Revision: https://reviews.llvm.org/D34394
llvm-svn: 307259
- Put buildfiles into /tmp/clang-build/build, instead of /tmp/clang-build.
We checkout the sources to /tmp/clang-build/src and running
cmake in /tmp/clang-build was done by mistake.
- Don't add an extra ';' at the start of enabled projects list.
It worked either way, but looked strange.
- Minor comment update.
llvm-svn: 307258
Summary:
- Removed double indirection via command-line args (i.e. two `--`
options of `build_docker_image.sh`).
- Added a comment on how to build 2-stage clang install into the
`build_docker_image.sh`, it used to be only in the `docs/Docker.rst`.
Reviewers: klimek, mehdi_amini
Reviewed By: klimek
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D35050
llvm-svn: 307256
The conversion to MatchTable left the function names and comments referring to
C++ statements and expressions. Updated the names and comments to account for
the fact that they're no longer unconstrained statements/expressions.
llvm-svn: 307248
Summary:
During remat, some subranges might end up having invalid segments which caused problems for later
coalescing.
Added in a check to remove segments that are invalidated as part of the remat.
See http://llvm.org/PR33524
Subscribers: MatzeB, qcolombet
Differential Revision: https://reviews.llvm.org/D34391
llvm-svn: 307247
The conversion to MatchTable left the function names and comments referring to
C++ statements and expressions. Updated the names and comments to account for
the fact that they're no longer unconstrained statements/expressions.
llvm-svn: 307246
It seems that the patch was reverted by mistake. Clang testing showed failure of the
MathExtras.SaturatingMultiply test, however I was unable to reproduce the issue on the
fresh code base and was able to confirm that the transformation introduced by the change
does not happen in the said test. This gives a strong confidence that the actual reason of
the failure of the initial patch was somewhere else, and that problem now seems to be
fixed. Re-submitting the change to confirm that.
llvm-svn: 307244
This covers both hard and soft float.
Hard float is easy, since it's just Legal.
Soft float is more involved, because there are several different ways to
handle it based on the predicate: one and ueq need not only one, but two
libcalls to get a result. Furthermore, we have large differences between
the values returned by the AEABI and GNU functions.
AEABI functions return a nice 1 or 0 representing true and respectively
false. GNU functions generally return a value that needs to be compared
against 0 (e.g. for ogt, the value returned by the libcall is > 0 for
true). We could introduce redundant comparisons for AEABI as well, but
they don't seem easy to remove afterwards, so we do different processing
based on whether or not the result really needs to be compared against
something (and just truncate if it doesn't).
llvm-svn: 307243
It is a bit unconvinent that client should implement this method
even if not use it. Patch provides default implementation.
Differential revision: https://reviews.llvm.org/D35009
llvm-svn: 307242
Summary:
As of this patch, 1018 out of 3938 rules are currently imported.
Depends on D32275
Reviewers: qcolombet, kristof.beyls, rovka, t.p.northover, ab, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: dberris, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D32278
llvm-svn: 307240
Also avoids ODR violations by ensuring names used in headers find the
same entity, not different, file-local entities in each translation
unit.
llvm-svn: 307237
Fix by Andrew Ng!
The Visual Studio build can contain output for multiple configuration types (
e.g. Debug, Release & RelWithDebInfo) within the same build output
directory. Therefore when discovering unit tests, the "build mode" sub directory
containing the appropriate configuration is included in the search. This sub
directory may not always be present, so a test for its existence is required.
Reviewers: zturner, modocache, dlj
Reviewed By: zturner, dlj
Subscribers: grimar, bd1976llvm, gbreynoo, edd, jhenderson, llvm-commits
Differential Revision: https://reviews.llvm.org/D34976
llvm-svn: 307235
Summary:
GlobalExtensions is dereferenced twice, once for iteration and then a check if it is empty.
As a ManagedStatic this dereference forces it's construction which is unnecessary.
Reviewers: efriedma, davide, mehdi_amini
Reviewed By: mehdi_amini
Subscribers: chapuni, llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D33381
llvm-svn: 307229
This reverts commit ae21ee0b6cacbc1efaf4d42502e71da2f0eb45c3.
The initial revert was done in order to prevent ongoing errors on
chromium bots such as CrWinClangLLD. However, this was done haphazardly
and I didn't realize there were test and compilation failures, so this
revert was reverted. Now that those have been fixed, we can revert the
revert of the revert.
llvm-svn: 307227
This reverts commit 5fecbbbe5049665d86834cf69d8f75db4f392308.
The initial revert was done in order to prevent ongoing errors on
chromium bots such as CrWinClangLLD. However, this was done haphazardly
and I didn't realize there were test and compilation failures, so this
revert was reverted. Now that those have been fixed, we can revert the
revert of the revert.
llvm-svn: 307226
LLVM's definition of dominance allows instructions that are cyclic
in unreachable blocks, e.g.:
%pat = select i1 %condition, @global, i16* %pat
because any instruction dominates an instruction in a block that's
not reachable from entry.
So, remove unreachable blocks from the function, because a) there's
no point in analyzing them and b) GlobalOpt should otherwise grow
some more complicated logic to break these cycles.
Differential Revision: https://reviews.llvm.org/D35028
llvm-svn: 307215
If we are lowering a libcall after legalization, we'll split the return type into a pair of legal values.
Patch by Jatin Bhateja and Eli Friedman.
Differential Revision: https://reviews.llvm.org/D34240
llvm-svn: 307207
The dependence analysis was returning incorrect information when using the GEPs
to compute dependences. The analysis uses the GEP indices under certain
conditions, but was doing it incorrectly when the base objects of the GEP are
aliases, but pointing to different locations in the same array.
This patch adds another check for the base objects. If the base pointer SCEVs
are not equal, then the dependence analysis should fall back on the path
that uses the whole SCEV for the dependence check. This fixes PR33567.
Differential Revision: https://reviews.llvm.org/D34702
llvm-svn: 307203
If a method / function returns a StringRef but the
variable is of type const std::string& a temporary string is
created (StringRef has a cast operator to std::string),
which is a suboptimal behavior.
Differential revision: https://reviews.llvm.org/D34994
Test plan: make check-all
llvm-svn: 307195
Previously we were generating a void(void) function type
for a weak alias. Update the weak-alias test case to
catch this.
Differential Revision: https://reviews.llvm.org/D34734
llvm-svn: 307194
This reverts commit 600d52c278e123dd08bee24c1f00932b55add8de.
This patch still seems to break CrWinClangLLD, reverting until I can
find root problem.
llvm-svn: 307189
This patch still seems to break CrWinClangLLD, reverting this once more
until I can discover root problem.
This reverts commit 3dbbc8ce43be50ffde2b1c655c6d3a25796fe78b.
llvm-svn: 307188
We had a lot of one-off tests for this type and that type,
or "every type that happens to be generated by this program
I built". Eventually I got a bug report filed where we were
crashing on a type that was not covered by any of these tests.
So this test carefully constructs a minimal C++ program that
will cause every type we support to be emitted. This ensures
full coverage for type records.
Differential Revision: https://reviews.llvm.org/D34915
llvm-svn: 307187
On power 8 we sometimes insert swaps to deal with the difference between
Little-Endian and Big-Endian. The swap removal pass is supposed to clean up
these swaps. On power 9 we don't need this pass since we do not need to insert
the swaps in the first place.
Commiting on behalf of Stefan Pintilie.
Differential Revision: https://reviews.llvm.org/D34627
llvm-svn: 307185
For two ROTR operations with shifts C1, C2; combined shift operand will be (C1 + C2) % bitsize.
Differential revision: https://reviews.llvm.org/D12833
llvm-svn: 307179
This patch adds the exploitation for new power 9 instructions which extract
variable elements from vectors:
VEXTUBLX
VEXTUBRX
VEXTUHLX
VEXTUHRX
VEXTUWLX
VEXTUWRX
Differential Revision: https://reviews.llvm.org/D34032
Commit on behalf of Zaara Syeda (syzaara@ca.ibm.com)
llvm-svn: 307174
This patch adds on to the exploitation added by https://reviews.llvm.org/D33510.
This now catches build vector nodes where the inputs are coming from sign
extended vector extract elements where the indices used by the vector extract
are not correct. We can still use the new hardware instructions by adding a
shuffle to move the elements to the correct indices. I introduced a new PPCISD
node here because adding a vector_shuffle and changing the elements of the
vector_extracts was getting undone by another DAG combine.
Commit on behalf of Zaara Syeda (syzaara@ca.ibm.com)
Differential Revision: https://reviews.llvm.org/D34009
llvm-svn: 307169
Make it usable by any class derived (even indirectly) from
LoadedObjectInfo by allowing a custom base class to be specified and
perfect forwarding to the ctor.
llvm-svn: 307166
Summary:
Also, made a few minor tweaks to shave off a little more cumulative memory consumption:
* All rules share a single NewMIs instead of constructing their own. Only one
will end up using it.
* Use MIs.resize(1) instead of MIs.clear();MIs.push_back(I) and prevent
GIM_RecordInsn from changing MIs[0].
Depends on D33764
Reviewers: rovka, vitalybuka, ab, t.p.northover, qcolombet, aditya_nandakumar
Reviewed By: ab
Subscribers: kristof.beyls, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D33766
llvm-svn: 307159
This adds exact flags to AShr/LShr flags where we can statically
prove it is valid using the range of induction variables. This
allows further optimisations to remove extra loads.
Differential Revision: https://reviews.llvm.org/D34207
llvm-svn: 307157
Several integer multiply/divide instructions require use of a
register pair as input and output. This patch moves setting
up the input register pair from C++ code to TableGen, simplifying
the whole process and making it more easily extensible.
No functional change.
llvm-svn: 307155
Fixes a couple of whitespace errors, re-sorts the vector floating-point
instructions to make them more easily extensible, and adds a missing
pseudo instruction.
No functional change.
llvm-svn: 307154
Summary:
This is like the LLVM_TOOLS_INSTALL_DIR option, but for the utils
that are installed when the LLVM_INSTALL_UTILS. This option
defaults to 'bin' to remain consistent with the current behavior, but
distros may want to install these to libexec/llvm.
Reviewers: beanz
Reviewed By: beanz
Subscribers: llvm-commits, mgorny
Differential Revision: https://reviews.llvm.org/D30655
llvm-svn: 307150
We used to have a helper that replaced an instruction with a libcall.
That turns out to be too aggressive, since sometimes we need to replace
the instruction with at least two libcalls. Therefore, change our
existing helper to only create the libcall and leave the instruction
removal as a separate step. Also rename the helper accordingly.
llvm-svn: 307149
This implements suggesting other mnemonics when an invalid one is specified,
for example:
$ echo "adXd r1,r2,#3" | llvm-mc -triple arm
<stdin>:1:1: error: invalid instruction, did you mean: add, qadd?
adXd r1,r2,#3
^
The implementation is target agnostic, but as a first step I have added it only
to the ARM backend; so the ARM backend is a good example if someone wants to
enable this too for another target.
Differential Revision: https://reviews.llvm.org/D33128
llvm-svn: 307148
r307133 brought back a couple instances of the same mistake that was already
fixed by r307088. Fixed it again.
Using NumPatternEmitted as a unique id for the tables is not valid on release
builds since the counters don't count in that case.
llvm-svn: 307146
Add a helper for building simple binary ops like add, mul, sub, and.
This can be used in the future for quickly adding support for or, xor.
llvm-svn: 307139
This patch seems to cause failures of test MathExtras.SaturatingMultiply on
multiple buildbots. Reverting until the reason of that is clarified.
Differential Revision: https://reviews.llvm.org/rL307126
llvm-svn: 307135
Summary:
This further improves the compile-time regressions that will be caused by a
re-commit of r303259.
Also added included preliminary work in preparation for the multi-insn emitter
since I needed to change the relevant part of the API for this patch anyway.
Depends on D33758
Reviewers: rovka, vitalybuka, ab, t.p.northover, qcolombet, aditya_nandakumar
Reviewed By: ab
Subscribers: kristof.beyls, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D33764
llvm-svn: 307133
-If there is a IndVar which is known to be non-negative, and there is a value which is also non-negative,
then signed and unsigned comparisons between them produce the same result. Both of those can be
seen in the same loop. To allow other optimizations to simplify them, we turn all instructions like
%c = icmp slt i32 %iv, %b
to
%c = icmp ult i32 %iv, %b
if both %iv and %b are known to be non-negative.
Differential Revision: https://reviews.llvm.org/D34979
llvm-svn: 307126
Relanding after rewriting undef.ll test to avoid host-dependant
endianness.
As discussed in D34087, rewrite areNonVolatileConsecutiveLoads using
generic checks. Also, propagate missing local handling from there to
BaseIndexOffset checks.
Tests of note:
* test/CodeGen/X86/build-vector* - Improved.
* test/CodeGen/BPF/undef.ll - Improved store alignment allows an
additional store merge
* test/CodeGen/X86/clear_upper_vector_element_bits.ll - This is a
case we already do not handle well. Here, the DAG is improved, but
scheduling causes a code size degradation.
Reviewers: RKSimon, craig.topper, spatel, andreadb, filcab
Subscribers: nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D34472
llvm-svn: 307114
getValueSitesForKind returns ArrayRef which has a cast operator
to std::vector, as a result a temporary vector is created
if the type of the variable is const std::vector&
that is suboptimal in this case.
Differential revision: https://reviews.llvm.org/D34970
Test plan: make check-all
llvm-svn: 307113
Original Patch and summary by Philip Reames.
RewriteStatepointsForGC tries to rewrite a function in a manner where
the optimizer can't end up using a pointer value after it might have
been relocated by a safepoint. This pass checks the invariant that
RSForGC is supposed to establish and that (if we constructed semantics
correctly) later passes must preserve.
This has been a really useful diagnostic tool when initially developing
the rewriting scheme and has found numerous bugs.
Differential Revision: https://reviews.llvm.org/D15940
Reviewed by: swaroop.sridhar, mjacob
Subscribers: llvm-commits
llvm-svn: 307112
We should rewrite this using the generic branch relaxation pass, but for
the moment having this pass is better than hitting an assertion error.
llvm-svn: 307109
Made some updates to the half.ll test under CodeGen to make it friendly to the update_llc_test_checks .py tool as follows:
1.Removing the llc flag -asm-verbose=false
2.Grouping the multiple check-prefix directives
3.Apply update_llc_test_checks.py tool on the test
This change is needed to easily update scheduling changes in an upcoming patch.
Reviewers: zvi, RKSimon, craig.topper
Differential Revision: https://reviews.llvm.org/D34934
llvm-svn: 307108
Using NumPatternEmitted as a unique id for the tables is not valid on release
builds since the counters don't count in that case.
Also fix an unused variable warning.
llvm-svn: 307088
Move from generic to X86 directory since gc intrinsics only supposed in
X86 64 bit.
Add target triple as well.
Fixes build failure in i686-linux-RA caused by rL307084.
llvm-svn: 307086
Summary:
We are crashing in LLC at O0 when gc intrinsics are present in the block.
The reason being FastISel performs basic block ISel by modifying GC.relocates
to be the first instruction in the block. This can cause us to visit the GC
relocate before it's corresponding GC.statepoint is visited, which is incorrect.
When we lower the statepoint, we record the base and derived pointers, along
with the gc.relocates. After this we can visit the gc.relocate.
This patch avoids fastISel from incorrectly creating the block with gc.relocate
as the first instruction.
Reviewers: qcolombet, skatkov, qikon, reames
Reviewed by: skatkov
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34421
llvm-svn: 307084
Summary:
Replace the matcher if-statements for each rule with a state-machine. This
significantly reduces compile time, memory allocations, and cumulative memory
allocation when compiling AArch64InstructionSelector.cpp.o after r303259 is
recommitted.
The following patches will expand on this further to fully fix the regressions.
Reviewers: rovka, ab, t.p.northover, qcolombet, aditya_nandakumar
Reviewed By: ab
Subscribers: vitalybuka, aemerson, javed.absar, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D33758
llvm-svn: 307079
Converting the Codegen test "extractelement-legalization-store-ordering.ll" to be "update_llc_test_checks" friendly.
The changes to the test are needed for an upcoming scheduling patch.
Reviewers: zvi, RKSimon
Differential Revision: https://reviews.llvm.org/D34935
llvm-svn: 307066
Record::getValues returns ArrayRef which has a cast operator
to std::vector, as a result a temporary vector is created
if the type of the variable is const std::vector&
that is suboptimal in this case.
Differential revision: https://reviews.llvm.org/D34969
Test plan: make check-all
llvm-svn: 307063
Summary:
When broadcasting from the constant pool its useful to print out the final vector similar to what we do for normal moves from the constant pool.
I changed only a couple tests that were broadcast focused. One of them had been previously hand tweaked after running the script so that it could check the constant pool declaration. But I think this patch makes that unnecessary now since we can check the comment instead.
Reviewers: spatel, RKSimon, zvi
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34923
llvm-svn: 307062
Summary: I believe this should be supported on GLM since RDSEED is.
Reviewers: m_zuckerman, zvi, RKSimon
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34828
llvm-svn: 307060
Record::getValues returns ArrayRef which has a cast operator
to std::vector, as a result a temporary vector is created
if the type of the variable is const std::vector&
that was suboptimal in this case.
Differential revision: https://reviews.llvm.org/D34969
Test plan: make check-all
llvm-svn: 307059
symbol resolver argument.
De-templatizing the symbol resolver is part of the ongoing simplification of
ORC layer API.
Removing the memory management argument (and delegating construction of memory
managers for RTDyldObjectLinkingLayer to a functor passed in to the constructor)
allows us to build JITs whose base object layers need not be compatible with
RTDyldObjectLinkingLayer's memory mangement scheme. For example, a 'remote
object layer' that sends fully relocatable objects directly to the remote does
not need a memory management scheme at all (that will be handled by the remote).
llvm-svn: 307058
Previously, if a basic block ended with a FRMIDX instruction, we would
end up doing something like this.
*std::next(MBB.end())
Which would hit an error:
"Assertion `!NodePtr->isKnownSentinel()' failed."
llvm-svn: 307057
The patch makes SoftenFloatResult/Operand logic just the same as all other legalization routines have: SoftenFloatResult() now fills the SoftenFloats map and SoftenFloatOperand() perform all needed replacements. This prevents softening mashinery from leaving stale entries in SoftenFloats map (that resulted in errors during the legalize type checking) and clarifies softening. The patch replaces https://reviews.llvm.org/D29265.
Differential Revision: https://reviews.llvm.org/D31946
llvm-svn: 307053
Summary:
This is a follow-up on D34077. Elena observed that the
correctness of the code relies on isPowerOf2(0) returning false.
Adding a test to cover this corner-case.
Reviewers: delena, davide, craig.topper
Reviewed By: davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34939
llvm-svn: 307046
Summary:
Add a combine for creating a truncate to replace a build_vector composed of extracts with
indices that form a stride-2^N series.
Example:
v8i32 V = ...
v4i32 build_vector((extract_elt V, 0), (extract_elt V, 2), (extract_elt V, 4), (extract_elt V, 6))
-->
v4i32 truncate (bitcast V to v4i64)
Related discussion in llvm-dev about canonicalizing shuffles to
truncates in LLVM IR:
http://lists.llvm.org/pipermail/llvm-dev/2017-January/108936.html.
Reviewers: spatel, RKSimon, efriedma, igorb, craig.topper, wolfgangp, delena
Reviewed By: delena
Subscribers: guyblank, delena, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D34077
llvm-svn: 307036
Summary: This makes it easier to find out which limitation prevented this pass from doing its work.
Reviewers: karthikthecool, mzolotukhin, efriedma, mcrosier
Reviewed By: mcrosier
Subscribers: mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D34940
llvm-svn: 307035
These all used 'CHECK-NOT' which isn't necessary if we have complete checks.
There were also over-specifications in the RUN params such as CPU model.
llvm-svn: 307033
These all used 'CHECK-NOT' which isn't necessary if we have complete checks.
There were also several over-specifications in the RUN params such as CPU model or OS requirement
llvm-svn: 307028
This reverts commit r306313. This breaks selfhost at -O3 and PR33652.
Let me know if you need additional information on reproducing the issue.
llvm-svn: 307021