This adds the minimum necessary to support codegen for simple ALU operations
on RV32. Prolog and epilog insertion, support for memory operations etc etc
follow in future patches.
Leave guessInstructionProperties=1 until https://reviews.llvm.org/D37065 is
reviewed and lands.
Differential Revision: https://reviews.llvm.org/D29933
llvm-svn: 316188
x86 has its own copy of integer absolute pattern matching to combine directly to a SUB+CMOV.
This patch removes the x86 combine and adds custom lowering support for ISD::ABS instead, allowing us to use the DAGCombiner version.
Additional test cases are already covered by iabs.ll (rL315706 and rL315711).
Differential Revision: https://reviews.llvm.org/D38895
llvm-svn: 316162
While parameterising by XLen, also take the opportunity to clean up the
formatting of the RISCV .td files.
This commit unifies the in-tree code with my patchset at
<https://github.com/lowrisc/riscv-llvm>.
llvm-svn: 316159
The method IEEEFloat::convertFromStringSpecials() does not recognize
the "+Inf" and "-Inf" strings but these strings are printed for
the double Infinities by the IEEEFloat::toString().
This patch adds the "+Inf" and "-Inf" strings to the list of recognized
patterns in IEEEFloat::convertFromStringSpecials().
Reviewers: sberg, bogner, majnemer, timshen, rnk, skatkov, gottesmm, bkramer, scanon
Reviewed By: skatkov
Subscribers: apilipenko, reames, llvm-commits
Differential Revision: https://reviews.llvm.org/D38030
llvm-svn: 316156
Original commit message:
"[cmake] Use find_package to discover zlib
This allows us to use standard cmake utilities to point to non-system zlib
locations.
Patch by Oksana Shadura and me (D39002)."
The new patch brings back the old behavior in the cases where find_package
cannot find zlib.
llvm-svn: 316150
MergeFunctions uses (through FunctionComparator) a map of GlobalValues
to identifiers because it needs to compare functions and globals
do not have an inherent total order. Thus, FunctionComparator
(through GlobalNumberState) has a ValueMap<GlobalValue *>.
r315852 added a RAUW on globals that may have been previously
encountered by the FunctionComparator, which would replace
a GlobalValue * key with a ConstantExpr *, which is illegal.
This commit adjusts that code path to remove the function being
replaced from the ValueMap as well.
llvm-svn: 316145
LLVM checks if it is performing an in-source build and then stop the
build. However, this check is also triggered if LLVM is being build as
part of a parent project, which prevents the parent project itself from
using in-source builds. For example, CMake allows a parent project to
specify the output of its subproject:
add_subdirectory(llvm llvm_build)
This tells CMake to conduct an out-tree build of LLVM, which without
this patch will still fails because what is being tested is the parent
project, not LLVM. This is fixed by using the "CURRENT" variable, which
is only concerned by the CMakeLists that is actually bein processed at
the moment.
Tests:
Ran `make check-llvm`.
Patch by Henrique Jung <henriquenj_AT_gmail_DOT_com>
llvm-svn: 316142
LineCoverageIterator makes it easy for clients of coverage data to
determine line execution counts for a file or function. The coverage
iteration logic is tricky enough that it really pays not to have
multiple copies of it. Hopefully having just one implementation in LLVM
will make the iteration logic easier to test, reuse, and update.
This commit is NFC but I've added a unit test to go along with it just
because it's easy to do now.
llvm-svn: 316141
This runs `udpate_mir_test_checks --add-vreg-checks` on the tests taht
are already more or less in the format that generates, so that there
will be less churn in some upcoming changes.
llvm-svn: 316139
This converts a large and somewhat arbitrary set of tests to use
update_mir_test_checks. I ran the script on all of the tests I expect
to need to modify for an upcoming mir syntax change and kept the ones
that obviously didn't change the tests in ways that might make it
harder to understand.
llvm-svn: 316137
This is a temporary hack to support adding checks for the "registers:"
block of mir functions. This is necessary to convert a number of tests
so that there's less churn when we change the MIR printer to put the
vreg classes on defs instead of in their own block.
llvm-svn: 316134
llvm-cov tends to highlight too many regions because its policy is to
highlight all region entry segments. This can look confusing to users:
not all region entry segments are interesting and deserve highlighting.
Emitting these highlights only when the region count differs from the
line count is a more user-friendly policy.
llvm-svn: 316109
Instead of copying around the wrapped segment and the list of line
segments, just pass a reference to a LineCoverageStats object. This
simplifies the interface. It also makes an upcoming change to suppress
distracting highlights possible.
llvm-svn: 316108
llvm-cov typically doesn't highlight gap segments, but it should if the
gap occurs after an uncovered region in order to preserve continuity.
llvm-svn: 316107
This patch lets the llvm tools handle the new HVX target features that
are added by frontend (clang). The target-features are of the form
"hvx-length64b" for 64 Byte HVX mode, "hvx-length128b" for 128 Byte mode HVX.
"hvx-double" is an alias to "hvx-length128b" and is soon will be deprecated.
The hvx version target feature is upgated form "+hvx" to "+hvxv{version_number}.
Eg: "+hvxv62"
For the correct HVX code generation, the user must use the following
target features.
For 64B mode: "+hvxv62" "+hvx-length64b"
For 128B mode: "+hvxv62" "+hvx-length128b"
Clang picks a default length if none is specified. If for some reason,
no hvx-length is specified to llvm, the compilation will bail out.
There is a corresponding clang patch.
Differential Revision: https://reviews.llvm.org/D38851
llvm-svn: 316101
All loads of form V6_vL32b_{,cur,nt,tmp,nt_cur,nt_tmp}_{ai,pi,ppu} are
predicable on v62 (but not on v60). Mark them all as predicable in the
instruction definitions, and handle the v60 case in HII::isPredicable.
llvm-svn: 316098
r315275 set the IsLittleEndian parameter incorrectly. This patch corrects
this, and adds a test to ensure such mistakes will be caught in the future.
llvm-svn: 316091
An empty livein block doesn't make much sense (why not just omit it?)
but they're legal and some tests have them, so its best to handle it.
llvm-svn: 316089
Matching prefixes isn't good enough, because it leads to things like
calling the first constant C3 just because there were two copies
before it. Tighten up the check to match more precisely, but also be
careful about ambiguity when dealing with target opcodes that end in a
number.
llvm-svn: 316088
Fix a couple of tests that were extending the wrong vreg, and
regenerate their checks with update_mir_test_checks. This looks like
it was a copy-paste or test update error.
llvm-svn: 316087
In the case where there was a conditional branch followed by a unconditional
branch with debug instruction separating them, MipsInstrInfo::analyzeBranch
would not skip past debug instruction when searching for the second branch
which give erroneous results about the control flow of the block.
This could lead to the branch folder to merge the non-fall through case
into it's predecessor, leaving the conditional branch with a dangling
basic block operand.
This resolves PR34975.
Thanks to Alexander Richardson for reporting the issue!
Reviewers: atanasyan
Differential Revision: https://reviews.llvm.org/D39003
llvm-svn: 316084
Added check that type of CmpConst and source type of trunc are equal
for correct matching of the case when we can set widened C constant
equal to CmpConstant.
%cond = cmp iN %x, CmpConst
%tr = trunc iN %x to iK
%narrowsel = select i1 %cond, iK %t, iK C
Patch by: Gainullin, Artur <artur.gainullin@intel.com>
llvm-svn: 316082
bug fix 316067 https://bugs.llvm.org/show_bug.cgi?id=34978
This test checks that the x86-interleaved ends without any
assertion.
Change-Id: I1e970482a4d0404516cbc85517fc091bb21c35a8
llvm-svn: 316080
This patch adds accurate instructions cost.
The formula presents two cases(stride 3 and stride 4) and calculates the cost according to the VF and stride.
Reviewers:
1. delena
2. Farhana
3. zvi
4. dorit
5. Ayal
Differential Revision: https://reviews.llvm.org/D38762
Change-Id: If4cfbd4ac0e63694e8144cb78c7fa34850647ff7
llvm-svn: 316072
Helper functions to identify sign- and zero-extending machine instruction is introduced in rL315888.
This patch makes PPCInstrInfo::optimizeCompareInstr use the helper functions. It simplifies the code and also makes possible more optimizations since the helper can do more analysis than the original check code; I observed about 5000 more compare instructions are eliminated while building LLVM.
Also, this patch fixes a bug in helpers on ANDIo instruction handling due to the order of checks. This bug causes a failure in an existing test case for optimizeCompareInstr.
Differential Revision: https://reviews.llvm.org/D38988
llvm-svn: 316071
Summary:
When we have the following case:
%cond = cmp iN %x, CmpConst
%tr = trunc iN %x to iK
%narrowsel = select i1 %cond, iK %t, iK C
We could possibly match only min/max pattern after looking through cast.
So it is more profitable if widened C constant will be equal CmpConst.
That is why just set widened C constant equal to CmpConst, because there
is a further check in this function that trunc CmpConst == C.
Also description for lookTroughCast function was added.
Reviewers: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38536
Patch by: Artur Gainullin <artur.gainullin@intel.com>
llvm-svn: 316070
This is introduced in rL308711.
Check for c library is incorrect here just because libc will be found always
and it does not mean that iconv is presented.
Thank to Andrew Krasny for narrowing down the root cause.
Reviewers: ecbeckmann
Reviewed By: ecbeckmann
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D38875
llvm-svn: 316064
Summary:
llvm-cfi-verify (D38379) introduced a potential build failure when compiling with `-DLLVM_BUILD_LLVM_DYLIB=ON -DLLVM_LINK_LLVM_DYLIB=ON`. Specific versions of cmake seem to treat the `add_subdirectory()` rule differently. It seems as if old versions of cmake BFS these rules, adding them to the fringe for expansion later. Newer versions of cmake seem to immediately execute CMakeFiles that are present in this subdirectory.
If the subdirectory is expanded through the fringe, the globbing resultant from `llvm_add_implicit_projects()` from `cmake/modules/AddLLVM.cmake:1012` means that `tools/llvm-shlib/CMakeFile.txt` gets executed before `tools/llvm-cfi-verify/lib/CMakeFile.txt`. As the latter CMakeFile adds a new library, this expansion order means that the library files required the unit tests in `unittests/tools/llvm-cfi-verify/` are not present in the dynamic library. This causes unit tests to fail as the required functions can't be found.
This change now ensures that the libraries created by `llvm-cfi-verify` are statically linked into the unit tests. As `tools/llvm-cfi-verify/lib` no longer adds anything to `llvm-shlib`, there should be no concern about the order-of-compilation.
Reviewers: skatkov, pcc
Reviewed By: skatkov, pcc
Subscribers: llvm-commits, kcc, pcc, aheejin, vlad.tsyrklevich, mgorny
Differential Revision: https://reviews.llvm.org/D39020
llvm-svn: 316059
This adds update_mir_test_checks, which updates the check lines in mir
tests. This can only update tests that start and end with .mir
currently (ie, -run-pass) but it should be sufficient for updating at
least some of the GlobalISel tests.
llvm-svn: 316057
Summary:
If a compare instruction is same or inverse of the compare in the
branch of the loop latch, then return a constant evolution node.
Currently scope of evaluation is limited to SCEV computation for
PHI nodes.
This shall facilitate computations of loop exit counts in cases
where compare appears in the evolution chain of induction variables.
Will fix PR 34538
Reviewers: sanjoy, hfinkel, junryoungju
Reviewed By: junryoungju
Subscribers: javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D38494
llvm-svn: 316054
When more than one Module is imported into the same context, such as during
an LTO build before linking the modules, ODR type uniquing may cause types
to point to a different CU. This check does not make sense in this case.
This fixes the error reported in PR34944.
https://bugs.llvm.org/show_bug.cgi?id=34944
rdar://problem/34940685
This reapplies a cleaner implementation of r316049.
llvm-svn: 316052
When more than one Module is imported into the same context, such as during
an LTO build before linking the modules, ODR type uniquing may cause types
to point to a different CU. This check does not make sense in this case.
This fixes the error reported in PR34944.
https://bugs.llvm.org/show_bug.cgi?id=34944
rdar://problem/34940685
llvm-svn: 316049
Summary:
std::unordered_multimap happens to be very slow when the number of elements
grows large. On one of our internal applications we observed a 17x compile time
improvement from changing it to DenseMap.
Reviewers: mehdi_amini, serge-sans-paille, davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38916
llvm-svn: 316045
The new scheme should match the normalization of embedded paths in
linkrepro tar files.
Differential Revision: https://reviews.llvm.org/D39023
llvm-svn: 316044
Note that cyclone itself doesn't fuse, but newer apple chips do and we
are using cyclone as the default when targeting apple OSes.
The current code also does not capture all fusion patterns of apple CPUs
yet; I am still looking for ways to refactor the code nicely to extend
it.
llvm-svn: 316036
If the address of a local is used in a comparison, AArch64 can fold the
address-calculation into the comparison via "adds". Unfortunately, a couple of
places (both hit in this one test) are not ready to deal with that yet and just
assume the first source operand is a register.
llvm-svn: 316035
Move the prune logic in pruneOverlaps to a new function, prune. This lets us
reuse the prune functionality. Makes the code a bit more readable. It'll also
make it easier to emit remarks/debug statements for pruned functions.
llvm-svn: 316031
This commit moves the decrement logic for outlined functions into the class,
and makes OccurrenceCount private. It can now be accessed via
getOccurrenceCount().
This makes it more difficult to accidentally introduce bugs by incorrectly
decrementing the occurrence count on OutlinedFunctions.
llvm-svn: 316020
Cleanup to Candidate that moves all end index calculations into
Candidate.endIdx(). For the sake of consistency, StartIdx and Len are now
private members, and can be accessed with length() and startIdx() respectively.
llvm-svn: 316019
NFC.
Added the Broadwell cpu and the BROADWELL prefix to all the scheduling regression tests, as part of prepartion for a larger commit of adding all Broadwell scheduiling.
Reviewers: RKSimon, zvi, aaboud
Differential Revision: https://reviews.llvm.org/D38994
Change-Id: I54bc9065168844c107b1729fcdc1d311ce3ea0a9
llvm-svn: 315998
Summary:
ValueTracking was recognizing not all variations of clamp. Swapping of
true value and false value of select was added to fix this problem. This
change breaks the canonical form of cmp inside the matchMinMax function,
that is why additional checks for compare predicates is needed. Added
corresponding test cases.
Reviewers: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38531
Patch by: Artur Gainullin <artur.gainullin@intel.com>
llvm-svn: 315992
Summary:
It seems that negative offset was accidentally allowed in D17967.
AFAICT small negative offset should be valid (always raise segfault) on all archs that I'm aware of (especially x86, which is the only one with this optimization enabled) and such case can be useful when loading hiden metadata from an object.
However, like the positive side, it should only be done within a certain limit.
For now, use the same limit on the positive side for the negative side.
A separate option can be added if needs appear.
Reviewers: mcrosier, skatkov
Reviewed By: skatkov
Subscribers: sanjoy, llvm-commits
Differential Revision: https://reviews.llvm.org/D38925
llvm-svn: 315991
Summary:
Make sure the map is cleared before processing a new module. Similar to what is done on `StackMaps`.
This issue is similar to D38588, though this time for FaultMaps (on x86) rather than ARM/AArch64. Other than possible mixing of information between modules, the crash is caused by the pointers values in the map that was allocated by the bump pointer allocator that is unwinded when emitting the next file. This issue has been around since 3.8.
This issue is likely much harder to write a test for since AFAICT it requires emitting something much more compilcated (and possibly real code) instead of just some random bytes.
Reviewers: skatkov, sanjoy
Reviewed By: skatkov, sanjoy
Subscribers: sanjoy, aemerson, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D38924
llvm-svn: 315990
Updated the scheduling information for the SkylakeClient target with the following changes:
1. regrouped the instructions after adding load and store latencies.
2. regrouped the instructions after adding identified missing ports in several groups.
The changes were made after revisiting the latencies impact of all the load and store uOps.
Reviewers: zvi, RKSimon, craig.topper
Differential Revision: https://reviews.llvm.org/D38727
Change-Id: I778a308cc11e490e8fa5e27e2047412a1dca029f
llvm-svn: 315978
This patch reverts rL315440 because of the bug described at
https://bugs.llvm.org/show_bug.cgi?id=34937
The fix for the bug is on review as D38944, but not yet ready. Given this is a regression reverting until a fix is ready is called for.
Max would have done the revert himself, but is having trouble doing a build of fresh LLVM for some reason. I did the build and test to ensure the revert worked as expected on his behalf.
llvm-svn: 315974
The right way to parse arch names is by creating a triple. This was
using getArchTypeForLLVMName before, which doesn't really do the right
thing here.
llvm-svn: 315965
We want to be writing a 32bit value, so we should be writing 4 bytes
instead of 2.
Patch by Alex Langford <apl@fb.com>.
Differential Revision: https://reviews.llvm.org/D38872
llvm-svn: 315964
In r315960, I accidentally assumed that the first line segment is
guaranteed to be the non-gap region entry segment (given that one is
present). It can actually be any segment on the line, and the test I
checked in demonstrates that.
llvm-svn: 315963
This reverts commit r315713. It causes PR34968.
I think I know what the problem is, but I don't think I'll have time to fix it
this week.
llvm-svn: 315962
Gap areas make it possible to correctly determine when to use counts
from deferred regions. Before gap areas were introduced, llvm-cov needed
to use a heuristic to do this: it ignored counts from segments that
start, but do not end, on a line. This heuristic breaks down on a simple
example (see PR34962).
This patch removes the heuristic and picks counts from any region entry
segment which isn't a gap area.
llvm-svn: 315960
We add /usr/local/include to the include directory list for some BSD
systems. We should mark this as a system directory to avoid files from
/usr/local/include getting picked over files shipping with llvm.
This typically manifested as gtest headers installed with the system
getting picked over the ones shipping with llvm.
Patch by Petr Penzin <penzin.dev@gmail.com>
Differential Revision: https://reviews.llvm.org/D37415
llvm-svn: 315952
This reverts commit r315823, thus re-applying r315781.
Also make sure we don't use G_BITCAST mapping for non-generic registers.
Non-generic registers don't have a type but do have a reg bank.
Something the COPY mapping now how to deal with but the G_BITCAST
mapping don't.
-- Original Commit Message --
We use to resort on the generic implementation to get the mappings for
COPYs. The generic implementation resorts on table lookup and
dynamically allocated objects to get the valid mappings.
Given we already know how to map G_BITCAST and have the static mappings
for them, use that code path for COPY as well. This is much more
efficient.
Improve the compile time of RegBankSelect by up to 20%.
Note: When we eventually generate all the mappings via TableGen, we
wouldn't have to do that dance to shave compile time. The intent of this
change was to make sure that moving to static structure really pays off.
NFC.
llvm-svn: 315947
This patch adds a new kind of metadata that indicates the possible callees of
indirect calls.
Differential Revision: https://reviews.llvm.org/D37354
llvm-svn: 315944
This will prevent doubling of line endings when parsing assembly and
emitting assembly.
Otherwise we'd parse the directive, consume the end of statement, hit
the next end of statement, and emit a fresh newline.
llvm-svn: 315943
static __global int Var = 0;
__global int* Ptr[] = {&Var};
...
In this case Var is a non premptable symbol and so its address can be used as the value of Ptr, with a base relative relocation that will add the delta between the ELF address and the actual load address. Such relocations do not require a symbol.
Differential Revision: https://reviews.llvm.org/D38909
llvm-svn: 315935
MSVC doesn't seem to like implicitly instantiating addPredicate and then
explicitly specializing it later. It causes an internal compiler error.
llvm-svn: 315930
This patch adds the ability to perform IPSCCP-like interprocedural analysis to
the generic sparse propagation solver. The patch gives clients the ability to
define their own custom LatticeKey types that the generic solver maps to custom
LatticeVal types. The custom lattice keys can be used, for example, to
distinguish among mappings for regular values, values returned from functions,
and values stored in global variables. Clients are responsible for defining how
to convert between LatticeKeys and LLVM Values by providing a specialization of
the LatticeKeyInfo template.
The added unit tests demonstrate how the generic solver can be used to perform
a simplified version of interprocedural constant propagation.
Differential Revision: https://reviews.llvm.org/D37353
llvm-svn: 315919
above PHI instructions.
ARC optimizer has an optimization that moves a call to an ObjC runtime
function above a phi instruction when the phi has a null operand and is
an argument passed to the function call. This optimization should not
kick in when the runtime function is an objc_release that releases an
object with precise lifetime semantics.
rdar://problem/34959669
llvm-svn: 315914
Previously these instructions were marked codegen only and had
an under-specified instruction description that did not record the
fcc register.
Reviewers: atanasyan, abeserminji
Differential Revision: https://reviews.llvm.org/D38847
llvm-svn: 315905
Minor addition and follow up of r314773 and r311533: this adds more
debug messages to the type legalizer. For each node, it dumps
legalization info for results and operands nodes, rather than just the
final legalized node.
Differential Revision: https://reviews.llvm.org/D38726
llvm-svn: 315904
Currently llvm-dwarfdump runs into llvm_unreachable when
faces DW_CFA_GNU_args_size. Patch implements the support.
Differential revision: https://reviews.llvm.org/D38879
llvm-svn: 315897
Summary:
The following transformation for cmp instruction:
icmp smin(x, PositiveValue), 0 -> icmp x, 0
should only be done after checking for min/max to prevent infinite
looping caused by a reverse canonicalization. That is why this
transformation was moved to place after the mentioned check.
Reviewers: spatel, efriedma
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38934
Patch by: Artur Gainullin <artur.gainullin@intel.com>
llvm-svn: 315895
We came across an llvm bug when compiling some testcases that 64-bit
immediates are silently truncated into 32-bit and then packed into
BPF_JMP | BPF_K encoding. This caused comparison with wrong value.
This bug looks to be introduced by r308080. The Select_Ri pattern is
supposed to be lowered into J*_Ri while the latter only support 32-bit
immediate encoding, therefore Select_Ri should have similar immediate
predicate check as what J*_Ri are doing.
Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 315889
This patch enables redundant sign- and zero-extension elimination in PowerPC MI Peephole pass.
If the input value of a sign- or zero-extension is known to be already sign- or zero-extended, the operation is redundant and can be eliminated.
One common case is sign-extensions for a method parameter or for a method return value; they must be sign- or zero-extended as defined in PPC ELF ABI.
For example of the following simple code, two extsw instructions are generated before the invocation of int_func and before the return. With this patch, both extsw are eliminated.
void int_func(int);
void ii_test(int a) {
if (a & 1) return int_func(a);
}
Such redundant sign- or zero-extensions are quite common in many programs; e.g. I observed about 60,000 occurrences of the elimination while compiling the LLVM+CLANG.
Differential Revision: https://reviews.llvm.org/D31319
llvm-svn: 315888
Summary:
iPTR is a pointer of subtarget-specific size to any address space. Therefore
type checks on this size derive the SizeInBits from a subtarget hook.
At this point, we can import the simplests G_LOAD rules and select load
instructions using them. Further patches will support for the predicates to
enable additional loads as well as the stores.
The previous commit failed on MSVC due to a failure to convert an
initializer_list to a std::vector. Hopefully, MSVC will accept this version.
Depends on D37457
Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: kristof.beyls, javed.absar, llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D37458
llvm-svn: 315887
Summary:
iPTR is a pointer of subtarget-specific size to any address space. Therefore
type checks on this size derive the SizeInBits from a subtarget hook.
At this point, we can import the simplests G_LOAD rules and select load
instructions using them. Further patches will support for the predicates to
enable additional loads as well as the stores.
Depends on D37457
Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: kristof.beyls, javed.absar, llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D37458
llvm-svn: 315885
Summary:
This includes some context-sensitivity in the MVT to LLT conversion so that
pointer types are tested correctly.
FIXME: I'm not happy with the way this is done since everything is a
special-case. I've yet to find a reasonable way to implement it.
select-load.mir fails because <1 x s64> loads in tablegen get priority over s64
loads. This is fixed in the next patch and as such they should be committed
together, I've posted them separately to help with the review.
Depends on D37456
Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar
Subscribers: kristof.beyls, javed.absar, llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D37457
llvm-svn: 315884
This allows lld-link to process /manifestinput: flags on macOS too.
Also makes the `REQUIRES: manifesttool` lld tests run on macOS.
Setting LLVM_ENABLE_LIBXML2 to off can suppress this behavior, like on Linux.
llvm-svn: 315873
Summary:
It's possible for a ComplexPattern to be used as an operator in a match
pattern. This is used by the load/store patterns in AArch64 to name the
suboperands returned by ComplexPattern predicate so that they can be broken
apart and referenced independently in the result pattern.
This patch adds support for this in order to enable the import of load/store
patterns.
Depends on D37445
Hopefully fixed the ambiguous constructor that a large number of bots reported.
Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D37456
llvm-svn: 315869
Summary:
It's possible for a ComplexPattern to be used as an operator in a match
pattern. This is used by the load/store patterns in AArch64 to name the
suboperands returned by ComplexPattern predicate so that they can be broken
apart and referenced independently in the result pattern.
This patch adds support for this in order to enable the import of load/store
patterns.
Depends on D37445
Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D37456
llvm-svn: 315863
Turns out we have no patterns on the instructions that were using this feature flag for other reasons. These instructions are slow on all modern CPUs so it seems unlikely that we will spend any effort supporting these instructions going forward. So we might as well just kill of the feature flag and just fix up the comments.
llvm-svn: 315862
Summary:
This was impeding our ability to combine the extending shuffles with other shuffles as you can see from the test changes.
There's one special case that needed to be added to use VZEXT directly for v8i8->v8i64 since the custom lowering requires v64i8.
Reviewers: RKSimon, zvi, delena
Reviewed By: delena
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38714
llvm-svn: 315860
Summary: I see nothing in Agner Fog's tables to indicate that this improved between Ivy Bridge and Haswell. It's also set for all Atom CPUs so I assume KNL should have it too.
Reviewers: RKSimon, zvi, gadi.haber
Reviewed By: gadi.haber
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38890
llvm-svn: 315859
In type inference, an empty type set for a specific hw mode is not an
error. In earlier stages of the design it was, but having to use non-
parameterized types with target intrinsics necessarily led to type
contradictions: since the intrinsics used specific types, they were
only valid for a specific hw mode, and the resulting type set for other
modes ended up empty. To accommodate the existence of such intrinsics
individual type sets were allowed to be empty as long as not all sets
were empty.
llvm-svn: 315858
This can result in significant code size savings in some cases,
e.g. an interrupt table all filled with the same assembly stub
in a certain Cortex-M BSP results in code blowup by a factor of 2.5.
Differential Revision: https://reviews.llvm.org/D34806
llvm-svn: 315853
This reduces code size for constructs like vtables or interrupt
tables that refer to functions in global initializers.
Differential Revision: https://reviews.llvm.org/D34805
llvm-svn: 315852
This avoid code duplication and allow us to add the disable unroll metadata elsewhere.
Differential Revision: https://reviews.llvm.org/D38928
llvm-svn: 315850
Summary:
It's better to use our shuffle lowering code to handle these than loading an immediate into a k-register.
It really feels like this should be a DAG combine optimization rather than a lowering operation, but that's a problem for another day.
Reviewers: RKSimon, delena, zvi
Reviewed By: delena
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38932
llvm-svn: 315849
We should be able to fold constant conditions by converting to shuffles, but fixing that would break these tests in their current form. Since they are really trying to test masking ops, add a non-constant mask to the selects.
llvm-svn: 315848
Summary:
There is an important mismatch between ISD::LOAD and G_LOAD (and likewise for
ISD::STORE and G_STORE). In SelectionDAG, ISD::LOAD is a non-atomic load
and atomic loads are handled by a separate node. However, this is not true of
GlobalISel's G_LOAD. For G_LOAD, the MachineMemOperand indicates the atomicity
of the operation. As a result, this mapping must also add a predicate that
checks for non-atomic MachineMemOperands.
This is NFC since these nodes always have predicates in practice and are
therefore always rejected at the moment.
Depends on D37443
Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: kristof.beyls, llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D37445
llvm-svn: 315843
Summary:
GlobalISel and SelectionDAG require different code for the common
load/store predicates due to differences in the representation.
For example:
SelectionDAG: (load<signext,i8>:i32 GPR32:$addr) // The <> denote properties of the SDNode that are not printed in the DAG
GlobalISel: (G_SEXT:s32 (G_LOAD:s8 GPR32:$addr))
Even without that, differences in the IR (SDNode vs MachineInstr) require
differences in the C++ predicate.
This patch moves the implementation of the common load/store predicates
into tablegen so that it can handle these differences.
It's NFC for SelectionDAG since it emits equivalent code and it's NFC for
GlobalISel since the rules involving the relevant predicates are still
rejected by the importer.
Depends on D36618
Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar
Subscribers: llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D37443
Includes a partial revert of r315826 since this patch makes it necessary for
getPredCode() to return a std::string and getImmCode() should have the same
interface as getPredCode().
llvm-svn: 315841
Fixes cfe/trunk/test/Misc/backend-resource-limit-diagnostics.cl
test after r315808
We may hit few other similar issues, but I want to discuss good
solution offline.
llvm-svn: 315830
- Update docs to match llvm coding style
- Add missing FP16_OVFL bit for gfx9
- Fix the size of the kernel descriptor in the docs
Differential Revision: https://reviews.llvm.org/D38902
llvm-svn: 315822
- Do not allow amd_amdgpu_isa directives on non-amdgcn architectures
- Do not allow amd_amdgpu_hsa_metadata on non-amdhsa OSes
- Do not allow amd_amdgpu_pal_metadata on non-amdpal OSes
Differential Revision: https://reviews.llvm.org/D38750
llvm-svn: 315812
- Emit NT_AMD_AMDGPU_ISA
- Add assembler parsing for isa version directive
- If isa version directive does not match command line arguments, then return error
Differential Revision: https://reviews.llvm.org/D38748
llvm-svn: 315808
If we are applying a byte mask to a value extracted from a shuffle, see if we can combine the mask into shuffle.
Fixes the last issue with PR22415
llvm-svn: 315807
We don't need a bitconvert as a root pattern in these cases. The types in the other parts of the pattern are sufficient to express the behavior of these instructions.
llvm-svn: 315798
I believe these were added incorrectly under the belief that the load size was smaller than the input register size, but that's not true.
llvm-svn: 315795
"No such file or directory: C:\\...\\tests\\Output\\shared-output.py.tmp/Output/Shared/SHARED.tmp"
And yet other forward-slashes don't seem to be causing the same
problem. I'll see if I can get ahold of a Windows machine to poke at
this directly later.
llvm-svn: 315792
Currently llvm assembler emits parsing error for valid IR assembly
alloca i32, i32 9, addrspace(5)
when alloca addr space is 5.
This patch fixes that.
Differential Revision: https://reviews.llvm.org/D38713
llvm-svn: 315791
Summary:
This patch removes the `verifyNCD` check.
The reason for this is that the other checks are sufficient to prove or disprove correctness of any DominatorTree, and that `verifyNCD` doesn't provide (in my option) better error messages then the other ones.
Additionally, this should give a (small) improvement to the total verification time, as the check is O(n), and checking the sibling property takes O(n^3).
Reviewers: dberlin, grosser, davide, brzycki
Reviewed By: brzycki
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38802
llvm-svn: 315790
There were two copies of the logic needed to construct a line stats
object for each line in a range: this patch brings it down to one. In
the future, this will make it easier for IDE clients to display coverage
in-line in source editors. To do that, we just need to move the new
LineCoverageIterator class to libCoverage.
llvm-svn: 315789
We use to resort on the generic implementation to get the mappings for
COPYs. The generic implementation resorts on table lookup and
dynamically allocated objects to get the valid mappings.
Given we already know how to map G_BITCAST and have the static mappings
for them, use that code path for COPY as well. This is much more
efficient.
Improve the compile time of RegBankSelect by up to 20%.
Note: When we eventually generate all the mappings via TableGen, we
wouldn't have to do that dance to shave compile time. The intent of this
change was to make sure that moving to static structure really pays off.
NFC.
llvm-svn: 315781
Summary:
Operand variable lookups are now performed by the RuleMatcher rather than
searching the whole matcher hierarchy for a match. This revealed a wrong-code
bug that currently affects ARM and X86 where patterns that use a variable more
than once in the match pattern will be imported but won't check that the
operands are identical. This can cause the tablegen-erated matcher to
accept matches that should be rejected.
Depends on D36569
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Subscribers: aemerson, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D36618
llvm-svn: 315780
This is particularly important for AVX512VL where we are better able to recognize the VBROADCAST loads to fold with other operations.
For AVX512VL we now use X86ISD::VBROADCAST for all of the patterns and remove the 128-bit X86ISD::VMOVDDUP.
We may be able to use this for AVX1 as well which would allow us to remove more isel patterns.
I also had to add X86ISD::VBROADCAST as a node to call combineShuffle for so that we treat it similar to X86ISD::MOVDDUP.
Differential Revision: https://reviews.llvm.org/D38836
llvm-svn: 315768
Summary:
There's only a tablegen testcase for IntImmLeaf and not a CodeGen one
because the relevant rules are rejected for other reasons at the moment.
On AArch64, it's because there's an SDNodeXForm attached to the operand.
On X86, it's because the rule either emits multiple instructions or has
another predicate using PatFrag which cannot easily be supported at the
same time.
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D36569
llvm-svn: 315761
TargetRegisterInfo::getMinimalPhysRegClass is actually pretty expensive
because it has to iterate over all the register classes.
Cache this information as we need and get it so that we limit its usage.
Right now, we heavily rely on it, because this is how we get the mapping
for vregs defined by copies from physreg (i.e., the one that are ABI
related).
Improve compile time by up to 10% for that pass.
NFC
llvm-svn: 315759
Prior to this patch we used to create SetVectors in temporaries that
were created and destroyed for each instruction. Now, instead we create
and destroyed them only once, but clear the content for each
instruction.
This speeds up the pass by ~25%.
NFC.
llvm-svn: 315756
It is possible for both a base and a derived class to be satisfied
with a unique vtable. If a program contains casts of the same pointer
to both of those types, the CFI checks will be lowered to this
(with ThinLTO):
if (p != &__typeid_base_global_addr)
trap();
if (p != &__typeid_derived_global_addr)
trap();
The optimizer may then use the first condition combined
with the assumption that __typeid_base_global_addr and
__typeid_derived_global_addr may not alias to optimize away the second
comparison, resulting in an unconditional trap.
This patch fixes the bug by giving imported globals the type [0 x i8]*,
which prevents the optimizer from assuming that they do not alias.
Differential Revision: https://reviews.llvm.org/D38873
llvm-svn: 315753
Summary:
The purpose of this patch is to expose more information about ImmLeaf-like
PatLeaf's so that GlobalISel can learn to import them. Previously, ImmLeaf
could only be used to test int64_t's produced by sign-extending an APInt.
Other tests on immediates had to use the generic PatLeaf and extract the
constant using C++.
With this patch, tablegen will know how to generate predicates for APInt,
and APFloat. This will allow it to 'do the right thing' for both SelectionDAG
and GlobalISel which require different methods of extracting the immediate
from the IR.
This is NFC for SelectionDAG since the new code is equivalent to the
previous code. It's also NFC for FastISel because FastIselShouldIgnore is 1
for the ImmLeaf subclasses. Enabling FastIselShouldIgnore == 0 for these new
subclasses will require a significant re-factor of FastISel.
For GlobalISel, it's currently NFC because the relevant code to import the
affected rules is not yet present. This will be added in a later patch.
Depends on D36086
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: bjope, aemerson, rengolin, javed.absar, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D36534
llvm-svn: 315747
This is only currently used for mad/fma transforms.
This is the only case where it should be used for AMDGPU,
so add an opcode to be sure.
llvm-svn: 315740
Each constant extender requires an extra instruction, which adds to the
code size and also reduces the number of available slots in an instruction
packet. In most cases, the value of a repeated constant extender could be
loaded into a register, and the instructions using the extender could be
replaced with their counterparts that use that register instead.
This patch adds a pass that tries to reduce the number of constant
extenders, including extenders which differ only in an immediate offset
known at compile time, e.g. @global and @global+12.
llvm-svn: 315735
I'm about to commit a patch that makes them necessary for getPredCode() and
it would be strange for getPredCode() and getImmCode() to require different
usage.
llvm-svn: 315733
This adds Intel's Knights Mill CPU to valid CPU names for the backend. For now its an alias of "knl", but ultimately we need to support AVX5124FMAPS and AVX5124VNNIW instruction sets for it.
Differential Revision: https://reviews.llvm.org/D38811
llvm-svn: 315722
This patch moves some common utility functions out of IPSCCP and makes them
available globally. The functions determine if interprocedural data-flow
analyses can propagate information through function returns, arguments, and
global variables.
Differential Revision: https://reviews.llvm.org/D37638
llvm-svn: 315719
Summary: This version of tests should be working properly.
Reviewers: vsk
Reviewed By: vsk
Differential Revision: https://reviews.llvm.org/D38889
llvm-svn: 315714
Summary:
This change uses the loop use list added in the previous change to remember the
loops that appear in the trip count expressions of other loops; and uses it in
forgetLoop. This lets us not scan every loop in the function on a forgetLoop
call.
With this change we no longer invalidate clear out backedge taken counts on
forgetValue. I think this is fine -- the contract is that SCEV users must call
forgetLoop(L) if their change to the IR could have changed the trip count of L;
solely calling forgetValue on a value feeding into the backedge condition of L
is not enough. Moreover, I don't think we can strengthen forgetValue to be
sufficient for invalidating trip counts without significantly re-architecting
SCEV. For instance, if we have the loop:
I = *Ptr;
E = I + 10;
do {
// ...
} while (++I != E);
then the backedge taken count of the loop is 9, and it has no reference to
either I or E, i.e. there is no way in SCEV today to re-discover the dependency
of the loop's trip count on E or I. So a SCEV client cannot change E to (say)
"I + 20", call forgetValue(E) and expect the loop's trip count to be updated.
Reviewers: atrick, sunfish, mkazantsev
Subscribers: mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D38435
llvm-svn: 315713
This refers to a temporary path that can be shared across all tests,
identified by a particular label. This can be used for things like
caches.
At the moment, the character set for the LABEL is limited to C
identifier characters, plus '-', '+', '=', and '.'. This is the same
set of characters currently allowed in REQUIRES clause identifiers.
llvm-svn: 315697
Summary:
In RS4GC it is possible that a base pointer is contained in a vector that
has undergone a bitcast from one element-pointertype to another. We teach
RS4GC how to look through bitcasts of vector types when looking for a base
pointer.
Reviewers: anna
Reviewed By: anna
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38849
llvm-svn: 315694
Summary: We seem to inconsistently create CMOV nodes some with a Glue result and some without. But I can't find any cases that use the Glue result. So I've tried to remove all the place that did this.
Reviewers: RKSimon, spatel, zvi
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38664
llvm-svn: 315686
Summary:
Documentation says that user can specify sources for both "show" and
"report" commands. "Show" command respects specified sources, but "report" does
not. It is useful to have both "show" and "report" generated for specified
sources. Also added tests to for both commands with sources specified.
Reviewers: vsk, kcc
Reviewed By: vsk
Differential Revision: https://reviews.llvm.org/D38860
llvm-svn: 315685
This patch adds timestamp verification for swiftmodule files. A new flag
is provided to allows us to disable this check in order to allow testing
of this feature.
Differential revision: https://reviews.llvm.org/D38686
llvm-svn: 315684
Summary:
This patch teaches SCEV to calculate the maxBECount when the end bound
of the loop can vary. Note that we cannot calculate the exactBECount.
This will only be done when both conditions are satisfied:
1. the loop termination condition is strictly LT.
2. the IV is proven to not overflow.
This provides more information to users of SCEV and can be used to
improve identification of finite loops.
Reviewers: sanjoy, mkazantsev, silviu.baranga, atrick
Reviewed by: mkazantsev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38825
llvm-svn: 315683
Significantly reduces performancei (~30%) of gipfeli
(https://github.com/google/gipfeli)
I have not yet managed to reproduce this regression with the open-source
version of the benchmark on github, but will work with others to get a
reproducer to you later today.
llvm-svn: 315680
Minor doc update that the FileCheck matcher supports POSIX ERE.
It also fixes a minor issue in the regexp describing a variable
name: underscores are allowed too.
Differential Revision: https://reviews.llvm.org/D38787
llvm-svn: 315679
There's no advantage to using these instructions when they aren't masked. This enables some additional execution domain switching without needing to update the table.
llvm-svn: 315674
Summary:
Currently we do not correctly invalidate memoized results for add recurrences
that were created directly (i.e. they were not created from a `Value`). This
change fixes this by keeping loop use lists and using the loop use lists to
determine which SCEV expressions to invalidate.
Here are some statistics on the number of uses of in the use lists of all loops
on a clang bootstrap (config: release, no asserts):
Count: 731310
Min: 1
Mean: 8.555150
50th %time: 4
95th %tile: 25
99th %tile: 53
Max: 433
Reviewers: atrick, sunfish, mkazantsev
Subscribers: mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D38434
llvm-svn: 315672
I don't know if we ever hit this case or not. Turning it into an assert only fired on expanding some atomic operation in a SystemZ lit test.
llvm-svn: 315648