Summary:
Without using a fixup in this case, BL will be used instead of BLX to
call internal ARM functions from Thumb functions.
Reviewers: rafael, t.p.northover, peter.smith, kristof.beyls
Reviewed By: peter.smith
Subscribers: srhines, echristo, aemerson, rengolin, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D33436
llvm-svn: 304413
Based on the original patch by Davide, but I've adjusted the API exposed
to just be different entry points rather than exposing more state
parameters. I've factored all the common logic out so that we don't have
any duplicate pipelines, we just stitch them together in different ways.
I think this makes the build easier to reason about and understand.
This adds a direct method for getting the module simplification pipeline
as well as a method to get the optimization pipeline. While not my
express goal, this seems nice and gives a good place comment about the
restrictions that are imposed on them.
I did make some minor changes to the way the pipelines are structured
here, but hopefully not ones that are significant or controversial:
1) I sunk the PGO indirect call promotion to only be run when we have
PGO enabled (or as part of the special ThinLTO pipeline).
2) I made the extra GlobalOpt run in ThinLTO just happen all the time
and at a slightly more powerful place (before we remove available
externaly functions). This seems like general goodness and not a big
compile time sink, so it didn't make sense to *only* use it in
ThinLTO. Fewer differences in the pipeline makes everything simpler
IMO.
3) I hoisted the ThinLTO stop point pre-link above the the RPO function
attr inference. The RPO inference won't infer anything terribly
meaningful pre-link (recursiveness?) so it didn't make a lot of
sense. But if the placement of RPO inference starts to matter, we
should move it to the canonicalization phase anyways which seems like
a better place for it (and there is a FIXME to this effect!). But
that seemed a bridge too far for this patch.
If we ever need to parameterize these pipelines more heavily, we can
always sink the logic to helper functions with parameters to keep those
parameters out of the public API. But the changes above seemed minor
that we could possible get away without the parameters entirely.
I added support for parsing 'thinlto' and 'thinlto-pre-link' names in
pass pipelines to make it easy to test these routines and play with them
in larger pipelines. I also added a really basic manifest of passes test
that will show exactly how the pipelines behave and work as well as
making updates to them clear.
Lastly, this factoring does introduce a nesting layer of module pass
managers in the default pipeline. I don't think this is a big deal and
the flexibility of decoupling the pipelines seems easily worth it.
Differential Revision: https://reviews.llvm.org/D33540
llvm-svn: 304407
Summary:
Add an early combine to match patterns such as:
(i16 bitcast (v16i1 x))
->
(i16 movmsk (v16i8 sext (v16i1 x)))
This combine needs to happen early enough before
type-legalization scalarizes the result of the setcc.
Reviewers: igorb, craig.topper, RKSimon
Subscribers: delena, llvm-commits
Differential Revision: https://reviews.llvm.org/D33311
llvm-svn: 304406
Summary: This pattern is no very useful per se, but it exposes optimization for toehr patterns that wouldn't kick in otherwize. It's very common and worth optimizing for.
Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32756
llvm-svn: 304402
Summary: Also see D33429 for other ThinLTO + New PM related changes.
Reviewers: davide, chandlerc, tejohnson
Subscribers: mehdi_amini, Prazek, cfe-commits, inglorion, llvm-commits, eraman
Differential Revision: https://reviews.llvm.org/D33525
llvm-svn: 304378
Summary: LiveRangeShrink pass moves instruction right after the definition with the same BB if the instruction and its operands all have more than one use. This pass is inexpensive and guarantees optimal live-range within BB.
Reviewers: davidxl, wmi, hfinkel, MatzeB, andreadb
Reviewed By: MatzeB, andreadb
Subscribers: hiraditya, jyknight, sanjoy, skatkov, gberry, jholewinski, qcolombet, javed.absar, krytarowski, atrick, spatel, RKSimon, andreadb, MatzeB, mehdi_amini, mgorny, efriedma, davide, dberlin, llvm-commits
Differential Revision: https://reviews.llvm.org/D32563
llvm-svn: 304371
We should have a single call site entry with no landing pad. This
indicates that no EH action should be taken and the unwinder should
unwind to the next frame.
We currently don't recognize __gxx_personality_seh0 as a known
personality, so we forcibly emit a table, and that table was wrong. This
was filed as PR33220. Now we emit a correct table for that personality.
The next step is to recognize that we can completely skip the table for
this personality.
llvm-svn: 304363
Summary:
Don't assign values to undefined references, simply don't emit those
reference edges as they are not useful (we were already not emitting
call edges to undefined refs).
Also, streamline the later lookup of value ids when writing the
summaries, by combining the check for value id existence with the access
of that value id.
Reviewers: pcc
Subscribers: Prazek, llvm-commits, inglorion
Differential Revision: https://reviews.llvm.org/D33634
llvm-svn: 304323
Summary:
If we attempt to unfold an SUnit in ScheduleDAG that results in
finding an already scheduled load, we must should abort the
unfold as it will not improve scheduling.
This fixes PR32610.
Reviewers: jmolloy, sunfish, bogner, spatel
Subscribers: llvm-commits, MatzeB
Differential Revision: https://reviews.llvm.org/D32911
llvm-svn: 304321
This adds a callback to the LLVMTargetMachine that lets target indicate
that they do not pass the machine verifier checks in all cases yet.
This is intended to be a temporary measure while the targets are fixed
allowing us to enable the machine verifier by default with
EXPENSIVE_CHECKS enabled!
Differential Revision: https://reviews.llvm.org/D33696
llvm-svn: 304320
Fixes PPCTTIImpl::getCacheLineSize() returning the wrong cache line size for
newer ppc processors.
Commiting on behalf of Stefan Pintilie.
Differential Revision: https://reviews.llvm.org/D33656
llvm-svn: 304317
This reverts commit r304310.
It caused build failures in polly and mingw
due to undefined reference to
llvm::RTLIB::getMEMCPY_ELEMENT_ATOMIC.
llvm-svn: 304315
This patch does an inline expansion of memcmp.
It changes the memcmp library call into an inline expansion when the size is
known at compile time and is under a target specified threshold.
This expansion is implemented in CodeGenPrepare and expands into straight line
code. The target specifies a maximum load size and the expansion works by using
this size to load the two sources, compare, and exit early if a difference is
found. It also has a special case when the memcmp result is used in a compare
to zero equality.
Differential Revision: https://reviews.llvm.org/D28637
llvm-svn: 304313
- new waitcnt pass remains off by default; -enable-si-insert-waitcnts=1 to enable it
- fix handling of PERMUTE ops
- fix insertion of waitcnt instrs at function begin/end ( port of analogous code that was added to old waitcnt pass )
- add new test
Differential Revision: https://reviews.llvm.org/D33114
llvm-svn: 304311
Summary:
Expanding the loop idiom test for memcpy to also recognize unordered atomic memcpy.
The only difference for recognizing
an unordered atomic memcpy and instead of a normal memcpy is
that the loads and/or stores involved are unordered atomic operations.
Background: http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html
Patch by Daniel Neilson!
Reviewers: reames, anna, skatkov
Reviewed By: reames
Subscribers: llvm-commits, mzolotukhin
Differential Revision: https://reviews.llvm.org/D33243
llvm-svn: 304310
Correct references to alignment of store which may be deleted in a
previous iteration of merge. Instead use first store that would be
merged.
Corrects pr33172's use-after-poison caught by ASan.
Reviewers: spatel, hfinkel, RKSimon
Reviewed By: RKSimon
Subscribers: thegameg, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D33686
llvm-svn: 304299
There are some VectorShuffle Nodes in SDAG which can be selected to XXPERMDI
Instruction, this patch recognizes them and does the selection to improve
the PPC performance.
Differential Revision: https://reviews.llvm.org/D33404
llvm-svn: 304298
This patch builds upon https://reviews.llvm.org/rL302810 to add
handling for bitwise logical operations in general purpose registers.
The idea is to keep the values in GPRs as long as possible - only
extracting them to a condition register bit when no further operations
are to be done.
Differential Revision: https://reviews.llvm.org/D31851
llvm-svn: 304282
Thanks to Galina Kistanova for finding the missing break!
When trying to make a test for this, I realized our logic for handling
extractvalue/insertvalue/... is somewhat broken. This makes constructing
a test-case for this missing break nontrivial.
llvm-svn: 304275
This test assumes that llc can infer a default triple. I'm not sure why
exactly, but the Verify MachineInstrs bot requires tests to be explicit
about this dependency.
This commit follows the lead from r248452 and adds in 'REQUIRES:
default_triple' to omit-empty.ll.
Bot URL: http://lab.llvm.org:8080/green/job/Verify-Machineinstrs_AArch64/7500
llvm-svn: 304269
This is the equivalent of r304048 for ARM:
- Rewrite livein calculation to use the computeLiveIns() helper
function. This is slightly less efficient but easier to reason about
and doesn't unnecessarily add pristine and reserved registers[1]
- Zero the status register at the beginning of the loop to make sure it
has a defined value.
- Remove kill flags of values that need to stay alive throughout the loop.
[1] An upcoming commit of mine will tighten the MachineVerifier to catch
these.
llvm-svn: 304267
Summary:
AntiDepBreaker intends to add all live-outs, including the implicit
CSRs, in StartBlock. r299124 was done without understanding that
intention.
Now with the live-ins propagated correctly (D32464), we can revert this change.
Reviewers: MatzeB, qcolombet
Subscribers: nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D33697
llvm-svn: 304251
There is no guarantee that the first use of a constant that is traversed
is actually the first in the related basic block. Thus, if we use that
as the insertion point we may end up with definitions that don't
dominate there use.
llvm-svn: 304244
This will fail if you configure with e.g. -Wno-unknown-warning-option.
Change it to check for 'warning:' just like we did for 'error:' in
r289484.
llvm-svn: 304239
r303763 caused build failures in some out-of-tree tests due to an assertion in
TTI. The original patch updated cost estimates for induction variable update
instructions marked for scalarization. However, it didn't consider that the
incoming value of an induction variable phi node could be a cast instruction.
This caused queries for cast instruction costs with a mix of vector and scalar
types. This patch includes a fix for cast instructions and the test case from
PR33193.
The fix was suggested by Jonas Paulsson <paulsson@linux.vnet.ibm.com>.
Reference: https://bugs.llvm.org/show_bug.cgi?id=33193
Original Differential Revision: https://reviews.llvm.org/D33457
llvm-svn: 304235
For multiplications of 64-bit values (giving 64-bit result), detect
cases where the arguments are sign-extended 32-bit values, on a per-
operand basis. This will allow few patterns to match a wider variety
of combinations in which extensions can occur.
llvm-svn: 304223
An encoding does not allow to use SDWA in an instruction with
scalar operands, either literals or SGPRs. That is however possible
to copy these operands into a VGPR first.
Several copies of the value are produced if multiple SDWA conversions
were done. To cleanup MachineLICM (to hoist copies out of loops),
MachineCSE (to remove duplicate copies) and SIFoldOperands (to replace
SGPR to VGPR copy with immediate copy right to the VGPR) runs are added
after the SDWA pass.
Differential Revision: https://reviews.llvm.org/D33583
llvm-svn: 304219