Now that we create the label at the point of the directive, we don't
need to set the "current CV location", and then later when we emit the
next instruction, create a label for it and emit it.
DWARF still defers the labels used in .debug_loc until the next
instruction or value, for reasons unknown.
llvm-svn: 340883
In Thumb1, legal imm range is [0, 255] for ADD/SUB instructions. However, the
legal imm range for LD/ST in (R+Imm) addressing mode is [0, 127]. Imms in
[128, 255] are materialized by mov R, #imm, and LD/STs use them in (R+R)
addressing mode.
This patch checks if a constant is used as offset in (R+Imm), if so, it checks
isLegalAddressingMode passing the constant value as BaseOffset.
Differential Revision: https://reviews.llvm.org/D50931
llvm-svn: 340882
Previously we followed the DWARF implementation, which waits until the
next instruction or data to emit the label to use in the .debug_loc
section. We might want to consider re-evaluating that design choice as
well, since it means the .loc skips alignment padding, for better or
worse.
This was the most minimal fix I could come up with, but we should be
able to do a lot of cleanups now that we don't need to save a pending CV
location on the CodeViewContext. I plan to do those next, but this
immediately fixes an assertion for some of our users.
llvm-svn: 340878
The new method name/behavior more closely models the way it was being used.
It also fixes an assertion that can occur when using the new ORC Core APIs,
where flags alone don't necessarily provide enough context to decide whether
the caller is responsible for materializing a given symbol (which was always
the reason this API existed).
The default implementation of getResponsibilitySet uses lookupFlags to determine
responsibility as before, so existing JITSymbolResolvers should continue to
work.
llvm-svn: 340874
Moving PassTimingInfo from legacy pass manager code into a separate header.
Making it suitable for both legacy and new pass manager.
Adding a test on -time-passes main functionality.
llvm-svn: 340872
The addObjectFile method adds the given object file to the JIT session, making
its code available for execution.
Support for the -extra-object flag is added to lli when operating in
-jit-kind=orc-lazy mode to support testing of this feature.
llvm-svn: 340870
These are intrinsics for supporting kadd builtins in clang. These builtins are already in gcc to implement intrinsics from icc. Though they are missing from the Intel Intrinsics Guide.
This instruction adds two mask registers together as if they were scalar rather than a vXi1. We might be able to get away with a bitcast to scalar and a normal add instruction, but that would require DAG combine smarts in the backend to recoqnize add+bitcast. For now I'd prefer to go with the easiest implementation so we can get these builtins in to clang with good codegen.
Differential Revision: https://reviews.llvm.org/D51370
llvm-svn: 340869
This can leave behind the uses with the defs removed.
Since this should only really happen in tests, it's not worth the
effort of trying to handle this.
llvm-svn: 340866
https://reviews.llvm.org/D51197
Currently, IRTranslator (and GISel) seems to be arbitrarily picking
which overflow intrinsics get mapped into opcodes which either have a
carry as an input or not.
For intrinsics such as Intrinsic::uadd_with_overflow, translate it to an
opcode (G_UADDO) which doesn't have any carry inputs (similar to LLVM
IR).
This patch adds 4 missing opcodes for completeness - G_UADDO, G_USUBO,
G_SSUBE and G_SADDE.
llvm-svn: 340865
Summary:
Add comments to help readers avoid having to read tablegen backends to
understand the code. Also remove unecessary breaks from the output.
Reviewers: dschuff, aheejin
Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D51371
llvm-svn: 340864
The original motivating example uses a 64-bit add, so the carry
is used. Insert a copy from VCC. This may allow shrinking of
the used carry instruction. At worst, we are replacing a
mov to materialize the constant with a copy of vcc.
llvm-svn: 340862
Summary:
Port libFuzzer to windows-msvc.
This patch allows libFuzzer targets to be built and run on Windows, using -fsanitize=fuzzer and/or fsanitize=fuzzer-no-link. It allows these forms of coverage instrumentation to work on Windows as well.
It does not fix all issues, such as those with -fsanitize-coverage=stack-depth, which is not usable on Windows as of this patch.
It also does not fix any libFuzzer integration tests. Nearly all of them fail to compile, fixing them will come in a later patch, so libFuzzer tests are disabled on Windows until them.
Patch By: metzman
Reviewers: morehouse, rnk
Reviewed By: morehouse, rnk
Subscribers: morehouse, kcc, eraman
Differential Revision: https://reviews.llvm.org/D51022
llvm-svn: 340860
This needs to be done in the SSA fold operands
pass to be effective, so there is a bit of overlap
with SIShrinkInstructions but I don't think this
is practically avoidable.
llvm-svn: 340859
Summary:
The updated tests were previously infallible because the SIMD bitwise
operations do not contain vector types in their names.
Reviewers: aheejin, dschuff
Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D51369
llvm-svn: 340858
These instructions were added on the PentiumPro along with CMOV.
This was already comprehended by the lowering process which should emit an alternate sequence using FCOM and FNSTW. This just makes it an explicit error if that doesn't work for some reason.
llvm-svn: 340844
Summary:
Right now this test is failing on the builtbots on Windows but we have a very similar setup where the test passes. The test is meant to test that specifying a timeout works correctly by running an infnite loop and having it timeout - on the buildbot, the infinite loop doesn't actually execute. This change runs all of the tests in the set using an internal shell rather than an external shell. I expect this will make the test pass which means that either the way the external shell is invoked or the external shell setup on the buildbots is not correct. Regardless of whether the test passes with this change, we'll need to undo this change and have a real fix.
@gkistanova was able to get logs from the buildbot to rule out a number of theories as to why this test is failing, but they didn't have enough information to confirm exactly what the issue is. The purpose of this change is to narrow it down, but if someone has a local repro and can aid in debugging, that would make it much speedier (and less prone to making the bots fail).
Reviewers: gkistanova, asmith, zturner, modocache, rnk, delcypher
Reviewed By: rnk
Subscribers: delcypher, llvm-commits, gkistanova
Differential Revision: https://reviews.llvm.org/D51326
llvm-svn: 340840
Summary:
For assembly input files, generate debug info even when the .file
directive is present, provided it does not include a file-number
argument. Fixes PR38695.
Reviewers: probinson, sidneym
Subscribers: aprantl, hiraditya, JDevlieghere, llvm-commits
Differential Revision: https://reviews.llvm.org/D51315
llvm-svn: 340839
CodeGenDAGPatterns::GenerateVariants is a costly function in many tblgen commands (33.87% of the total runtime of x86 -gen-dag-isel), and due to the O(N^2) nature of the function, there are a high number of repeated comparisons of the pattern's vector<Predicate>.
This initial patch at least avoids repeating these comparisons for every Variant in a pattern. I began investigating caching all the matches before entering the loop but hit issues with how best to store the data and how to update the cache as patterns were added.
Saves around 15secs in debug builds of x86 -gen-dag-isel.
Differential Revision: https://reviews.llvm.org/D51035
llvm-svn: 340837
Summary:
Sometimes reading an output *.ll file it is not easy to understand why some callsites are not inlined. We can read output of inline remarks (option --pass-remarks-missed=inline) and try correlating its messages with the callsites.
An easier way proposed by this patch is to add to every callsite processed by Inliner an attribute with the latest message that describes the cause of not inlining this callsite. The attribute is called //inline-remark//. By default this feature is off. It can be switched on by the option //-inline-remark-attribute//.
For example in the provided test the result method //@test1// has two callsites //@bar// and inline remarks report different inlining missed reasons:
remark: <unknown>:0:0: bar not inlined into test1 because too costly to inline (cost=-5, threshold=-6)
remark: <unknown>:0:0: bar not inlined into test1 because it should never be inlined (cost=never): recursive
It is not clear which remark correspond to which callsite. With the inline remark attribute enabled we get the reasons attached to their callsites:
define void @test1() {
call void @bar(i1 true) #0
call void @bar(i1 false) #2
ret void
}
attributes #0 = { "inline-remark"="(cost=-5, threshold=-6)" }
..
attributes #2 = { "inline-remark"="(cost=never): recursive" }
Patch by: yrouban (Yevgeny Rouban)
Reviewers: xbolva00, tejohnson, apilipenko
Reviewed By: xbolva00, tejohnson
Subscribers: eraman, llvm-commits
Differential Revision: https://reviews.llvm.org/D50435
llvm-svn: 340834
This is cleanup after newly introduced google/benchmark library
(rL340809). Many buildbots fail to identify regex engine support, so
this should presumably fix the issue.
llvm-svn: 340827
This patch also uses colors to highlight problematic wait-time entries.
A problematic entry is an entry with an high wait time that tends to match (or
exceed) the size of the scheduler's buffer.
Color RED is used if an instruction had to wait an average number of cycles
which is bigger than (or equal to) the size of the underlying scheduler's
buffer.
Color YELLOW is used if the time (in cycles) spend waiting for the
operands or pipeline resources is bigger than half the size of the underlying
scheduler's buffer.
Color MAGENTA is used if an instruction does not consume buffer resources
according to the scheduling model.
llvm-svn: 340825
ImmutableList used to require elements to have a copy constructor for no
good reason, this patch aims to fix this.
It also required but did not enforce its elements to be trivially
destructible, so a new static_assert is added to guard against misuse.
Differential Revision: https://reviews.llvm.org/D49985
llvm-svn: 340824
Summary:
This fixes PR31105.
There is code trying to delete dead code that does so by e.g. checking if
the single predecessor of a block is the block itself.
That check fails on a block like this
bb:
br i1 undef, label %bb, label %bb
since that has two (identical) predecessors.
However, after the check for dead blocks there is a call to
ConstantFoldTerminator on the basic block, and that call simplifies the
block to
bb:
br label %bb
Therefore we now do the call to ConstantFoldTerminator before the check if
the block is dead, so it can realize that it really is.
The original behavior lead to the block not being removed, but it was
simplified as above, and then we did a call to
Dest->replaceAllUsesWith(&*I);
with old and new being equal, and an assertion triggered.
Reviewers: chandlerc, fhahn
Reviewed By: fhahn
Subscribers: eraman, llvm-commits
Differential Revision: https://reviews.llvm.org/D51280
llvm-svn: 340820
Fix for the out-of-memory error when compiling SemaChecking.cpp
with GVNHoist and ubsan enabled. I've used a cache for inserted
CHIs to avoid excessive memory usage.
Differential Revision: https://reviews.llvm.org/D50323
llvm-svn: 340818
This patch creates the shift mask and actual shift using the vXi16 vector shift ops.
Differential Revision: https://reviews.llvm.org/D51263
llvm-svn: 340813
ompiling benchmark library (introduced in D50894) with the latest
bootstrapped Clang produces a lot of warnings, this issue was addressed
in the upstream patch I pushed earlier.
Upstream patch:
f85304e4e3
`README.LLVM` notes were updated to reflect the latest changes.
Reviewed by: lebedev.ri
Differential Revision: https://reviews.llvm.org/D51342
llvm-svn: 340811
This patch pulls google/benchmark v1.4.1 into the LLVM tree so that any
project could use it for benchmark generation. A dummy benchmark is
added to `llvm/benchmarks/DummyYAML.cpp` to validate the correctness of
the build process.
The current version does not utilize LLVM LNT and LLVM CMake
infrastructure, but that might be sufficient for most users. Two
introduced CMake variables:
* `LLVM_INCLUDE_BENCHMARKS` (`ON` by default) generates benchmark
targets
* `LLVM_BUILD_BENCHMARKS` (`OFF` by default) adds generated
benchmark targets to the list of default LLVM targets (i.e. if `ON`
benchmarks will be built upon standard build invocation, e.g. `ninja` or
`make` with no specific targets)
List of modifications:
* `BENCHMARK_ENABLE_TESTING` is disabled
* `BENCHMARK_ENABLE_EXCEPTIONS` is disabled
* `BENCHMARK_ENABLE_INSTALL` is disabled
* `BENCHMARK_ENABLE_GTEST_TESTS` is disabled
* `BENCHMARK_DOWNLOAD_DEPENDENCIES` is disabled
Original discussion can be found here:
http://lists.llvm.org/pipermail/llvm-dev/2018-August/125023.html
Reviewed by: dberris, lebedev.ri
Subscribers: ilya-biryukov, ioeric, EricWF, lebedev.ri, srhines,
dschuff, mgorny, krytarowski, fedor.sergeev, mgrang, jfb, llvm-commits
Differential Revision: https://reviews.llvm.org/D50894
llvm-svn: 340809
Summary:
Correct to use set like behaviour of AllocType. Should check for
subset, not precise value.
Reviewers: theraven
Reviewed By: theraven
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D50959
llvm-svn: 340807
Summary:
I'm not sure if this patch is correct or if it needs more qualifying somehow. Bitcast shouldn't change the size of the load so it should be ok? We already do something similar for stores. We'll change the type of a volatile store if the resulting store is Legal or Custom. I'm not sure we should be allowing Custom there...
I was playing around with converting X86 atomic loads/stores(except seq_cst) into regular volatile loads and stores during lowering. This would allow some special RMW isel patterns in X86InstrCompiler.td to be removed. But there's some floating point patterns in there that didn't work because we don't fold (f64 (bitconvert (i64 volatile load))) or (f32 (bitconvert (i32 volatile load))).
Reviewers: efriedma, atanasyan, arsenm
Reviewed By: efriedma
Subscribers: jvesely, arsenm, sdardis, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, arichardson, jrtc27, atanasyan, jfb, llvm-commits
Differential Revision: https://reviews.llvm.org/D50491
llvm-svn: 340797
This patch issues an error message if Darwin ABI is attempted with the PPC
backend. It also cleans up existing test cases, either converting the test to
use an alternative triple or removing the test if the coverage is no longer
needed.
Updated Tests
-------------
The majority of test cases were updated to use a different triple that does not
include the Darwin ABI. Many tests were also updated to use FileCheck, in place
of grep.
Deleted Tests
-------------
llvm/test/tools/dsymutil/PowerPC/sibling.test was originally added to test
specific functionality of dsymutil using an object file created with an old
version of llvm-gcc for a Powerbook G4. After a discussion with @JDevlieghere he
suggested removing the test.
llvm/test/CodeGen/PowerPC/combine_loads_from_build_pair.ll was converted from a
PPC test to a SystemZ test, as the behavior is also reproducible there.
All other tests that were deleted were specific to the darwin/ppc ABI and no
longer necessary.
Phabricator Review: https://reviews.llvm.org/D50988
llvm-svn: 340795
This causes crashes due to the interleaved dbg.value intrinsics being
left at the end of basic blocks, causing the actual terminators (br,
etc) to be not where they should be (not at the end of the block),
leading to later crashes.
Further discussion on the original commit thread.
This reverts commit r340368.
llvm-svn: 340794
verify*() methods are intended to have no side-effects (unless we detect
broken MSSA, in which case they assert()), and all of the other verify
methods are wrapped by `#ifndef NDEBUG`.
llvm-svn: 340793
This lines up with the behavior of an existing transform where if both
operands of the binop are shuffled, we allow moving the binop before the
shuffle regardless of whether the shuffle changes the size of the vector.
llvm-svn: 340787
Fix the issue of duplicating the call to `exp{,2}()` when it's nested in
`pow()`, as exposed by rL340462.
Differential revision: https://reviews.llvm.org/D51194
llvm-svn: 340784
The code that generates the loop definition operand for phis
in the epilog and kernel is incorrect in some cases.
In the kernel, when a phi refers to another phi, the code that
updates PhiOp2 needs to include the stage difference between
the two phis.
In the epilog, the check for using the loop definition instead
of the phi definition uses the StageDiffAdj value (the difference
between the phi stage and the loop definition stage), but the
adjustment is not needed to determine if the current stage
contains an iteration with the loop definition.
Differential Revision: https://reviews.llvm.org/D51167
llvm-svn: 340782
Summary:
The new stackification backend generates the giant switch statement
used to translate instructions to their stackified forms. I did this
because it was more interesting than adding all the different vector
versions of the various SIMD instructions to the switch statment
manually.
Reviewers: aardappel, aheejin, dschuff
Subscribers: mgorny, sbc100, jgravelle-google, sunfish, jfb, llvm-commits
Differential Revision: https://reviews.llvm.org/D51318
llvm-svn: 340781
This patch removes the MSBuild warnings about options that
clang-cl ignores. It also adds several additional fields to
the LLVM Configuration options page. The first is that it
adds support for LLD! To give the user flexibility though,
we don't want to force LLD to always-on, and if we're not
forcing LLD then we might as well not force clang-cl either.
So we add options that can enable or disable lld, clang-cl,
or any combination of the two. Whenever one is disabled,
it falls back to the Microsoft equivalent.
Additionally, for each of clang-cl and lld-link, we add a new
configuration setting that allows Additional Options to be
passed for that specific tool only. This is similar to the
C/C++ > Command Line > Additional Options entry box, but
it serves the use case where a user switches back and forth
between the toolsets in their vcxproj, but where cl.exe
won't accept some options that clang-cl will. In this case
you can pass those options in the clang-cl additional options
and whenever clang-cl is disabled (or the other toolset is
selected entirely), those options won't get passed at all.
llvm-svn: 340780
This reverts r319889.
Unfortunately, wrapping flags are not a part of SCEV's identity (they
do not participate in computing a hash value or in equality
comparisons) and in fact they could be assigned after the fact w/o
rebuilding a SCEV.
Grep for const_cast's to see quite a few of examples, apparently all
for AddRec's at the moment.
So, if 2 expressions get built in 2 slightly different ways: one with
flags set in the beginning, the other with the flags attached later
on, we may end up with 2 expressions which are exactly the same but
have their operands swapped in one of the commutative N-ary
expressions, and at least one of them will have "sorted by complexity"
invariant broken.
2 identical SCEV's won't compare equal by pointer comparison as they
are supposed to.
A real-world reproducer is added as a regression test: the issue
described causes 2 identical SCEV expressions to have different order
of operands and therefore compare not equal, which in its turn
prevents LoadStoreVectorizer from vectorizing a pair of consecutive
loads.
On a larger example (the source of the test attached, which is a
bugpoint) I have seen even weirder behavior: adding a constant to an
existing SCEV changes the order of the existing terms, for instance,
getAddExpr(1, ((A * B) + (C * D))) returns (1 + (C * D) + (A * B)).
Differential Revision: https://reviews.llvm.org/D40645
llvm-svn: 340777
Normally we force Unix line endings in the repository, but since these are Windows files which are consumed by Microsoft tools that we don't have the source of, we should probably err on the side of caution and force CRLF.
llvm-svn: 340776
If the liveness of a physical register was invalid, this
was attempting to iterate the subregisters of all register
uses of the instruction, which would assert when it
encountered an implicit virtual register operand.
llvm-svn: 340763
Loosens an assert in getMemRIX16Encoding that restricts DQ-form instructions to
using an immediate, so that we can assemble instructions like lxv/stxv where the
offset is an expression.
Differential Revision: https://reviews.llvm.org/D51122
llvm-svn: 340761
We're using a 256-bit PACKUS to do the truncation, but that instruction operates on 128-bit lanes. So previously we shuffled first to rearrange the lanes. But that requires 2 shuffles. Instead we can shuffle after the PACKUS using a single VPERMQ. This matches what our normal LowerTRUNCATE code does when it uses PACKUS.
Differential Revision: https://reviews.llvm.org/D51284
llvm-svn: 340757
InstCombine mucks these up a bit. So we need to do some additional pattern matching to fix it. There are a still a few special cases not handled, but this covers the general case.
Differential Revision: https://reviews.llvm.org/D50952
llvm-svn: 340756
Summary:
This patch introduces llvm-mca as a library. The driver (llvm-mca.cpp), views, and stats, are not part of the library.
Those are separate components that are not required for the functioning of llvm-mca.
The directory has been organized as follows:
All library source files now reside in:
- `lib/HardwareUnits/` - All subclasses of HardwareUnit (these represent the simulated hardware components of a backend).
(LSUnit does not inherit from HardwareUnit, but Scheduler does which uses LSUnit).
- `lib/Stages/` - All subclasses of the pipeline stages.
- `lib/` - This is the root of the library and contains library code that does not fit into the Stages or HardwareUnit subdirs.
All library header files now reside in the `include` directory and mimic the same layout as the `lib` directory mentioned above.
In the (near) future we would like to move the library (include and lib) contents from tools and into the core of llvm somewhere.
That change would allow various analysis and optimization passes to make use of MCA functionality for things like cost modeling.
I left all of the non-library code just where it has always been, in the root of the llvm-mca directory.
The include directives for the non-library source file have been updated to refer to the llvm-mca library headers.
I updated the llvm-mca/CMakeLists.txt file to include the library headers, but I made the non-library code
explicitly reference the library's 'include' directory. Once we eventually (hopefully) migrate the MCA library
components into llvm the include directives used by the non-library source files will be updated to point to the
proper location in llvm.
Reviewers: andreadb, courbet, RKSimon
Reviewed By: andreadb
Subscribers: mgorny, javed.absar, tschuett, gbedwell, llvm-commits
Differential Revision: https://reviews.llvm.org/D50929
llvm-svn: 340755
Summary: We needed quotes around %python before to make python work correctly (on Windows) if the path contains spaces. I recently made a change so that %python now inherently has quotes, so now adding quotes around %python makes the test fail because the quotes cancel each other.
Reviewers: asmith, inglorion
Subscribers: mehdi_amini, eraman, steven_wu, dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D51244
llvm-svn: 340753
In Bionic, open can be overloaded for _FORTIFY_SOURCE support, causing
compile errors of RetryAfterSignal due to overload resolution. Wrapping
the call in a lambda avoids this.
Based on a patch by Chih-Wei Huang <cwhuang@linux.org.tw>!
llvm-svn: 340751
Summary:
Made it convert from register to stack based instructions, and removed the registers.
Fixes to related code that was expecting register based instructions.
Added the correct testing flag to all tests, depending on what the
format they were expecting so far.
Translated one test to stack format as example: reg-stackify-stack.ll
tested:
llvm-lit -v `find test -name WebAssembly`
unittests/MC/*
Reviewers: dschuff, sunfish
Subscribers: sbc100, jgravelle-google, eraman, aheejin, llvm-commits, jfb
Differential Revision: https://reviews.llvm.org/D51241
llvm-svn: 340750
Before this patch, the SchedulerStatistics only printed the maximum number of
buffer entries consumed in each scheduler's queue at a given point of the
simulation.
This patch restructures the reported table, and adds an extra field named
"Average number of used buffer entries" to it.
This patch also uses different colors to help identifying bottlenecks caused by
high scheduler's buffer pressure.
llvm-svn: 340746
This commit has caused failures in some internal benchmarks. Temporarily
reverting this patch until the issue can be diagnosed and fixed.
llvm-svn: 340740
Summary:
This commit adds the case of tail calling a sret function from a non-sret
function when both functions have the C calling convention.
llvm-svn: 340737
Summary: If an object file ends with a relocation that is smaller
than 4 bytes we will write outside the Data array and trigger an
"Invalid index" assertion.
Reviewers: jyknight, venkatra
Reviewed By: jyknight
Subscribers: fedor.sergeev, jrtc27, llvm-commits
Differential Revision: https://reviews.llvm.org/D50971
llvm-svn: 340736
Summary:
Remove unnecessary lines from `sibcall.ll` and rename labels according
to @RKSimon's recommendations in the D45653 conversation.
llvm-svn: 340735
The internal benchmark failure reported by Google was due to a missing
check for the result type for the sign-extend and shift DAG. This commit
adds the check and re-commits the patch.
llvm-svn: 340734
Summary: The GR740 provides an up cycle counter in the registers ASR22
and ASR23. As these registers can not be read together atomically we only
use the value of ASR23 for llvm.readcyclecounter(). The ASR23 register
holds the 32 LSBs of the up-counter.
Reviewers: jyknight, venkatra
Reviewed By: jyknight
Subscribers: jfb, fedor.sergeev, jrtc27, llvm-commits
Differential Revision: https://reviews.llvm.org/D48638
llvm-svn: 340733
We have a class `ImplicitControlFlowTracking` which allows us to keep track of
instructions that can abnormally exit and answer queries like "whether or not
there is side-exiting instruction above this instruction in its block".
We may want to have the similar tracking for other types of "special" instructions,
for example instructions that write memory.
This patch separates ImplicitControlFlowTracking into two classes, isolating all
general logic not related to implicit control flow into its parent class. We can
later make another child of this class to keep track of instructions that write
memory.
The motivation for that is that we want to make these checks efficiently in the
patch https://reviews.llvm.org/D50891.
NOTE: The naming of the parent class is not super cool, but the other options we
have are hardly better. Please feel free to rename it as NFC if you think you've
found a more informative name for it.
Differential Revision: https://reviews.llvm.org/D50954
Reviewed By: fedor.sergeev
llvm-svn: 340728
The existing method is protected, and requires using DataRefImpl
and SmallVector.
Differential Revision: https://reviews.llvm.org/D50995
llvm-svn: 340725
Summary:
Currently bitcasting constants from f64 to v2i32 is done by storing the
value to the stack and then loading it again. This is not necessary, but
seems to happen because v2i32 is a valid type for Sparc V8. If it had not
been legal, we would have gotten help from the type legalizer.
This patch tries to do the same work as the legalizer would have done by
bitcasting the floating point constant and splitting the value up into a
vector of two i32 values.
Reviewers: venkatra, jyknight
Reviewed By: jyknight
Subscribers: glaubitz, fedor.sergeev, jrtc27, llvm-commits
Differential Revision: https://reviews.llvm.org/D49219
llvm-svn: 340723
We cannot directy reuse the patterns of StPat because for some reason the store
DAG node and the atomic_store_nn DAG nodes put the ptr and the value in
different positions. Currently we attempt to store the address to an address
formed by the value.
Differential Revision: https://reviews.llvm.org/D51217
llvm-svn: 340722
vXi32 support was recently moved from LowerMUL_LOHI to LowerMULH.
This commit shares the getOperand calls, switches both to use common IsSigned flag, and hoists the NumElems/NumElts variable.
llvm-svn: 340720
This is a pretty large refactor / re-write of the Microsoft
demangler. The previous one was a little hackish because it
evolved as I was learning about all the various edge cases,
exceptions, etc. It didn't have a proper AST and so there was
lots of custom handling of things that should have been much
more clean.
Taking what was learned from that experience, it's now
re-written with a completely redesigned and much more sensible
AST. It's probably still not perfect, but at least it's
comprehensible now to someone else who wants to come along
and make some modifications or read the code.
Incidentally, this fixed a couple of bugs, so I've enabled
the tests which now pass.
llvm-svn: 340710
Summary: This was inheriting the cost from the AVX table, but should be legal under AVX512.
Reviewers: RKSimon
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D51267
llvm-svn: 340708
Summary:
Previously most CPUs inherited cmov support through Feature64Bit(or FeatureCMPXCHG16HB implying Feature64Bit) or FeatureSSE1.
This has the surprising side effect that -mattr=-cmov causes an assert to fire in 64-bit mode because it clears the Feature64Bit. Or in 32-bit mode, -mattr=-cmov disables any sse/avx features which seems surprising.
This patch removes the implication and instead updates hasCMOV in X86Subtarget to check SSE1 or is64Bit in addition to the regular cmov flag. This should keep most things working the way they did before. I don't believe there is a way to specific "-cmov" directly from clang so this should only effect our lower level tools.
This does stop -mattr=cx16(cmpxchg16b) from implying cmov is enabled via the 64bit flag as you can see from one of the changed tests. But that was a 32-bit test so I don't know why it enabled cx16 anyway.
For the other test I had to add -sse to override the new sse check in hasCMOV.
Reviewers: RKSimon, DavidKreitzer, spatel
Reviewed By: RKSimon
Subscribers: llvm-commits, jfb
Differential Revision: https://reviews.llvm.org/D51228
llvm-svn: 340707
Summary: This matches gcc and one cpuid dump I found online. Given that these are considered 7th generation x86 CPU it seems likely they support cmov since cmov was added by Intel in their 6th generation.
Reviewers: RKSimon, spatel
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D51264
llvm-svn: 340706
I noticed this along with the patterns in D51125, but when the index is variable,
we don't convert insertelement into a build_vector.
For x86, that means these get expanded at legalization time into the loading/spilling
code that we see in the tests. I think it's always better to avoid going to memory on
these, and we get the optimal 'broadcast' if it's available.
I suspect other targets may want to look at enabling the hook. AArch64 and AMDGPU have
regression tests that would be affected (although I did not check what would happen in
those cases). In the most basic cases shown here, AArch64 would probably do much
better with a splat.
Differential Revision: https://reviews.llvm.org/D51186
llvm-svn: 340705
Private symbols are not visible outside the object file, and so not defined by
the object file from ORC's perspective.
No test case yet. Ideally this would be a unit test parsing a checked-in binary,
but I am not aware of any way to reference the LLVM source root from a unit
test.
llvm-svn: 340703
vectors, and move this test code into an anonymous namespace.
Hoping that this will avoid hitting an MSVC bug that causes it to crash
and burn pretty spectacularly. Also, this degree of clever use of
initializer lists seems somewhat questionable in general. ;]
llvm-svn: 340702
This is a bit awkward in a handful of places where we didn't even have
an instruction and now we have to see if we can build one. But on the
whole, this seems like a win and at worst a reasonable cost for removing
`TerminatorInst`.
All of this is part of the removal of `TerminatorInst` from the
`Instruction` type hierarchy.
llvm-svn: 340701
agree with MSVC.
There isn't actually a need for specialization here as we can write the
code generically and just have a test that will fold away as a constant.
llvm-svn: 340700
`isExceptionalTermiantor` and implement it for opcodes as well following
the common pattern in `Instruction`.
Part of removing `TerminatorInst` from the `Instruction` type hierarchy
to make it easier to share logic and interfaces between instructions
that are both terminators and not terminators.
llvm-svn: 340699
The core get and set routines move to the `Instruction` class. These
routines are only valid to call on instructions which are terminators.
The iterator and *generic* range based access move to `CFG.h` where all
the other generic successor and predecessor access lives. While moving
the iterator here, simplify it using the iterator utilities LLVM
provides and updates coding style as much as reasonable. The APIs remain
pointer-heavy when they could better use references, and retain the odd
behavior of `operator*` and `operator->` that is common in LLVM
iterators. Adjusting this API, if desired, should be a follow-up step.
Non-generic range iteration is added for the two instructions where
there is an especially easy mechanism and where there was code
attempting to use the range accessor from a specific subclass:
`indirectbr` and `br`. In both cases, the successors are contiguous
operands and can be easily iterated via the operand list.
This is the first major patch in removing the `TerminatorInst` type from
the IR's instruction type hierarchy. This change was discussed in an RFC
here and was pretty clearly positive:
http://lists.llvm.org/pipermail/llvm-dev/2018-May/123407.html
There will be a series of much more mechanical changes following this
one to complete this move.
Differential Revision: https://reviews.llvm.org/D47467
llvm-svn: 340698
Legalize G_ADD for types smaller than i32.
LegalizationArtifactCombiner replaces extend instructions with appropriate
bitwise instructions.
Patch by Petar Avramovic.
Differential Revision: https://reviews.llvm.org/D51213
llvm-svn: 340697
Summary: NameLen wasn't being used and caused the parameters in gdb to very long, in my case, crashes in others. Please also perform the correct magical incarnations to have this be applied to the LLVM 7 branch.
Reviewers: whitequark, CodaFi
Reviewed By: CodaFi
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D51141
llvm-svn: 340691
Summary:
The only time vector SMUL_LOHI/UMUL_LOHI nodes are created is during division/remainder lowering. If its created before op legalization, generic DAGCombine immediately turns that SMUL_LOHI/UMUL_LOHI into a MULHS/MULHU since only the upper half is used. That node will stick around through vector op legalization and will be turned back into UMUL_LOHI/SMUL_LOHI during op legalization. It will then be custom lowered by the X86 backend. Due to this two step lowering the vector shuffles created by the custom lowering get legalized after their inputs rather than before. This prevents the shuffles from being combined with any build_vector of constants.
This patch uses changes vXi32 to use MULHS/MULHU instead. This is what the later DAG combine did anyway. But by skipping the change back to UMUL_LOHI/SMUL_LOHI we lower it before any constant BUILD_VECTORS. This allows the vector_shuffle creation to constant fold with the build_vectors. This accounts for the test changes here.
Reviewers: RKSimon, spatel
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D51254
llvm-svn: 340690
Summary:
Previously the value being stored is the last operand in SDNode. This causes the type legalizer to visit the mask operand before the value operand. The type legalizer was more complicated because of this since we want the type of the value to drive the decisions.
This patch moves the value to be the first operand so we visit it first during type legalization. It also simplifies the type legalization code accordingly.
X86 is currently the only in tree target that uses this SDNode. Not sure if there are any users out of tree.
Reviewers: RKSimon, delena, hfinkel, eli.friedman
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D50402
llvm-svn: 340689
This is a preliminary step for a preliminary step for D50992.
I noticed that x86 often misses chances to load a scalar directly
into a vector register.
So this patch is just allowing more of those cases to match a
broadcast op in lowerBuildVectorAsBroadcast(). The old code comment
said it doesn't make sense to use a broadcast when we're loading a
single element and everything else is undef, but I think that's the
best case in the improved tests in insert-loaded-scalar.ll. We avoid
scalar-to-vector-register move and/or less efficient shuffling.
Note that there are some existing types that were already producing
a broadcast, but that happens semi-accidentally. Ie, it's not
happening as part of lowerBuildVectorAsBroadcast(). The build vector
gets expanded into load + shuffle, and then shuffle lowering produces
the broadcast.
Description of the other test diffs:
1. avx-basic.ll - replacing load+shufle is a win.
2. sse3-avx-addsub-2.ll - vmovddup vs. vbroadcastss is neutral
3. sse41.ll - don't care - we convert that intrinsic to generic IR now, so this test is deprecated
4. vector-shuffle-128-v8.ll / vector-shuffle-256-v16.ll - pshufb alternatives with an extra instruction are not obviously bad
Differential Revision: https://reviews.llvm.org/D51125
llvm-svn: 340685
Summary:
Patch by Marek Olsak and David Stuttard, both of AMD.
This adds a new amdgcn intrinsic supporting s.buffer.load, in particular
multiple dword variants. These are convenient to use from some front-end
implementations.
Also modified the existing llvm.SI.load.const intrinsic to common up the
underlying implementation.
This modification also requires that we can lower to non-uniform loads correctly
by splitting larger dword variants into sizes supported by the non-uniform
versions of the load.
V2: Addressed minor review comments.
V3: i1 glc is now i32 cachepolicy for consistency with buffer and
tbuffer intrinsics, plus fixed formatting issue.
V4: Added glc test.
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D51098
Change-Id: I83a6e00681158bb243591a94a51c7baa445f169b
llvm-svn: 340684
Summary:
If any of the bundled instructions are marked as FrameSetup
or FrameDestroy, then that property is set on the BUNDLE
instruction as well.
As long as the scheduler/packetizer aren't mixing
prologue/epilogue instructions (i.e. all the bundled
instructions have the same property) then this simply gives
the bundle the correct property (so when using a bundle
iterator in late passes a bundle will be correctly identified
as FrameSetup/FrameDestroy).
When for example bundling a mix of FrameSetup instructions
with non-FrameSetup instructions it could be discussed if
the bundle should have the property or not. The choice here
has been to set these properties on the BUNDLE instruction if
any of the bundled instructions have the property set.
Reviewers: #debug-info, kparzysz
Reviewed By: kparzysz
Subscribers: vsk, thegameg, llvm-commits
Differential Revision: https://reviews.llvm.org/D50637
llvm-svn: 340680
Summary:
When computeIntervals is looking through COPY instruction to
extend the location mapping for a debug variable it did not
handle subregisters correctly.
For example
DBG_VALUE debug-use %0.sub_8bit_hi, ...
%1:gr16 = COPY %0
was transformed into
DBG_VALUE debug-use %0.sub_8bit_hi, ...
%1:gr16 = COPY %0
DBG_VALUE debug-use %1, ...
So the subregister index was missing in the added DBG_VALUE.
As long as the subreg refered to the least significant bits
of the superreg, then I guess we could get the correct
result in a debugger even when referring to the superreg.
But as in the example above when the subreg refers to other
parts of the superreg, then debuginfo would be incorrect.
I'm not sure exactly how to fix this properly, so this patch
just avoids looking through the COPY when there is a subreg
involved (for more info, see the FIXME added in the code).
Reviewers: rnk, aprantl
Reviewed By: aprantl
Subscribers: JDevlieghere, llvm-commits
Tags: #debug-info
Differential Revision: https://reviews.llvm.org/D50788
llvm-svn: 340679
Summary:
Handle the case IDVal is an empty string.
This bug was uncovered by a LLVM MC Assembler Protocol Buffer
Fuzzer for the RISC-V assembly language.
Reviewers: rnk
Reviewed By: rnk
Subscribers: rnk, niravd, pcc, peter.smith, asb, grosbach, llvm-commits, bcain, kito-cheng, shiva0217, rogfer01, PkmX
Differential Revision: https://reviews.llvm.org/D50808
llvm-svn: 340678
HermitCore is a POSIX-compatible kernel for running a single application in an isolated environment to get maximum performance and predictable runtime behavior. It can either be used bare-metal on hardware or a VM (Unikernel) or side by side to an existing Linux system (Multikernel).
Due to the latter feature, HermitCore binaries are marked with ELFOSABI_STANDALONE to let the Linux ELF loader distinguish them from regular Unix/Linux binaries and load them using the HermitCore "proxy" tool.
Patch by Colin Finck!
llvm-svn: 340675
The function's new implementation from r340583 had a bug in it that
could cause an invalid scope to be generated when merging two
DILocations with no common ancestor scope.
This patch detects this situation and picks the scope of the first
location. This is not perfect, because the scope is misleading, but on
the other hand, this will be a line 0 location.
rdar://problem/43687474
Differential Revision: https://reviews.llvm.org/D51238
llvm-svn: 340672
demangling process when it does.
Use this to support a "lookup" query for the mangling canonicalizer that
does not create new nodes. This could also be used to implement
demangling with a fixed-size temporary storage buffer.
Reviewers: erik.pilkington
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D51003
llvm-svn: 340670
This node doesn't directly correspond to a mangled name fragment, so
it's useful to explicitly describe when it's created and what it's for.
llvm-svn: 340664
Summary:
Given a set of equivalent name fragments, this mechanism determines whether two
mangled names are equivalent. The intent is to use this for fuzzy matching of
profile data against the program after certain refactorings are performed.
Reviewers: erik.pilkington, dlj
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D50935
llvm-svn: 340663
Choosing to revert the change and do it again, hopefully preserving the history
of the changes by using svn copy instead of simply creating a new file from the
contents within Scheduler.
llvm-svn: 340661
Back in https://reviews.llvm.org/D19559, I tried to teach CVP about range facts implied by value/value icmps (i.e. no constants.) In the meantime, we've implemented the optimization, but I couldn't find tests checked in, so adding them.
llvm-svn: 340660